[2023-12-11 07:47:14,212][model3_sft.py][INFO] Epoch:[0/2](0/63764) loss:3.283 lr:0.0000000 epoch_Time:1061.0min:
[2023-12-11 07:47:32,523][model3_sft.py][INFO] Epoch:[0/2](50/63764) loss:3.314 lr:0.0000010 epoch_Time:402.0min:
[2023-12-11 07:47:50,825][model3_sft.py][INFO] Epoch:[0/2](100/63764) loss:2.511 lr:0.0000020 epoch_Time:395.0min:
[2023-12-11 07:48:09,347][model3_sft.py][INFO] Epoch:[0/2](150/63764) loss:3.190 lr:0.0000030 epoch_Time:394.0min:
[2023-12-11 07:48:27,684][model3_sft.py][INFO] Epoch:[0/2](200/63764) loss:2.976 lr:0.0000040 epoch_Time:392.0min:
[2023-12-11 07:48:45,979][model3_sft.py][INFO] Epoch:[0/2](250/63764) loss:2.962 lr:0.0000050 epoch_Time:391.0min:
[2023-12-11 07:49:04,289][model3_sft.py][INFO] Epoch:[0/2](300/63764) loss:2.714 lr:0.0000060 epoch_Time:391.0min:
[2023-12-11 07:49:22,595][model3_sft.py][INFO] Epoch:[0/2](350/63764) loss:3.256 lr:0.0000070 epoch_Time:389.0min:
[2023-12-11 07:49:40,907][model3_sft.py][INFO] Epoch:[0/2](400/63764) loss:3.277 lr:0.0000080 epoch_Time:389.0min:
[2023-12-11 07:49:59,206][model3_sft.py][INFO] Epoch:[0/2](450/63764) loss:3.753 lr:0.0000090 epoch_Time:389.0min:
[2023-12-11 07:50:17,539][model3_sft.py][INFO] Epoch:[0/2](500/63764) loss:2.974 lr:0.0000100 epoch_Time:387.0min:
[2023-12-11 07:50:35,893][model3_sft.py][INFO] Epoch:[0/2](550/63764) loss:2.916 lr:0.0000110 epoch_Time:387.0min:
[2023-12-11 07:50:54,229][model3_sft.py][INFO] Epoch:[0/2](600/63764) loss:3.176 lr:0.0000120 epoch_Time:387.0min:
[2023-12-11 07:51:12,545][model3_sft.py][INFO] Epoch:[0/2](650/63764) loss:2.431 lr:0.0000130 epoch_Time:387.0min:
[2023-12-11 07:51:30,862][model3_sft.py][INFO] Epoch:[0/2](700/63764) loss:3.223 lr:0.0000140 epoch_Time:386.0min:
[2023-12-11 07:51:49,171][model3_sft.py][INFO] Epoch:[0/2](750/63764) loss:2.743 lr:0.0000150 epoch_Time:386.0min:
[2023-12-11 07:52:07,495][model3_sft.py][INFO] Epoch:[0/2](800/63764) loss:3.102 lr:0.0000160 epoch_Time:386.0min:
[2023-12-11 07:52:25,821][model3_sft.py][INFO] Epoch:[0/2](850/63764) loss:3.520 lr:0.0000170 epoch_Time:385.0min:
[2023-12-11 07:52:44,151][model3_sft.py][INFO] Epoch:[0/2](900/63764) loss:3.268 lr:0.0000180 epoch_Time:385.0min:
[2023-12-11 07:53:02,454][model3_sft.py][INFO] Epoch:[0/2](950/63764) loss:3.228 lr:0.0000190 epoch_Time:385.0min:
[2023-12-11 07:53:20,760][model3_sft.py][INFO] Epoch:[0/2](1000/63764) loss:3.321 lr:0.0000200 epoch_Time:384.0min:
[2023-12-11 07:53:39,083][model3_sft.py][INFO] Epoch:[0/2](1050/63764) loss:3.310 lr:0.0000200 epoch_Time:384.0min:
[2023-12-11 07:53:57,605][model3_sft.py][INFO] Epoch:[0/2](1100/63764) loss:3.284 lr:0.0000200 epoch_Time:384.0min:
[2023-12-11 07:54:15,958][model3_sft.py][INFO] Epoch:[0/2](1150/63764) loss:2.991 lr:0.0000200 epoch_Time:383.0min:
[2023-12-11 07:54:34,303][model3_sft.py][INFO] Epoch:[0/2](1200/63764) loss:3.961 lr:0.0000200 epoch_Time:383.0min:
[2023-12-11 07:54:52,641][model3_sft.py][INFO] Epoch:[0/2](1250/63764) loss:2.938 lr:0.0000200 epoch_Time:383.0min:
[2023-12-11 07:55:10,982][model3_sft.py][INFO] Epoch:[0/2](1300/63764) loss:3.122 lr:0.0000200 epoch_Time:383.0min:
[2023-12-11 07:55:29,328][model3_sft.py][INFO] Epoch:[0/2](1350/63764) loss:3.038 lr:0.0000200 epoch_Time:382.0min:
[2023-12-11 07:55:47,661][model3_sft.py][INFO] Epoch:[0/2](1400/63764) loss:3.370 lr:0.0000200 epoch_Time:382.0min:
[2023-12-11 07:56:06,000][model3_sft.py][INFO] Epoch:[0/2](1450/63764) loss:2.925 lr:0.0000200 epoch_Time:382.0min:
[2023-12-11 07:56:24,313][model3_sft.py][INFO] Epoch:[0/2](1500/63764) loss:3.926 lr:0.0000200 epoch_Time:381.0min:
[2023-12-11 07:56:42,655][model3_sft.py][INFO] Epoch:[0/2](1550/63764) loss:3.236 lr:0.0000200 epoch_Time:381.0min:
[2023-12-11 07:57:01,027][model3_sft.py][INFO] Epoch:[0/2](1600/63764) loss:3.508 lr:0.0000200 epoch_Time:381.0min:
[2023-12-11 07:57:19,338][model3_sft.py][INFO] Epoch:[0/2](1650/63764) loss:3.524 lr:0.0000200 epoch_Time:380.0min:
[2023-12-11 07:57:37,673][model3_sft.py][INFO] Epoch:[0/2](1700/63764) loss:3.490 lr:0.0000200 epoch_Time:380.0min:
[2023-12-11 07:57:55,979][model3_sft.py][INFO] Epoch:[0/2](1750/63764) loss:3.218 lr:0.0000200 epoch_Time:380.0min:
[2023-12-11 07:58:14,306][model3_sft.py][INFO] Epoch:[0/2](1800/63764) loss:3.788 lr:0.0000200 epoch_Time:379.0min:
[2023-12-11 07:58:32,637][model3_sft.py][INFO] Epoch:[0/2](1850/63764) loss:3.190 lr:0.0000200 epoch_Time:379.0min:
[2023-12-11 07:58:50,974][model3_sft.py][INFO] Epoch:[0/2](1900/63764) loss:3.682 lr:0.0000200 epoch_Time:379.0min:
[2023-12-11 07:59:09,286][model3_sft.py][INFO] Epoch:[0/2](1950/63764) loss:2.864 lr:0.0000200 epoch_Time:379.0min:
[2023-12-11 07:59:27,622][model3_sft.py][INFO] Epoch:[0/2](2000/63764) loss:2.833 lr:0.0000200 epoch_Time:378.0min:
[2023-12-11 07:59:46,176][model3_sft.py][INFO] Epoch:[0/2](2050/63764) loss:3.172 lr:0.0000200 epoch_Time:378.0min:
[2023-12-11 08:00:04,517][model3_sft.py][INFO] Epoch:[0/2](2100/63764) loss:2.999 lr:0.0000200 epoch_Time:378.0min:
[2023-12-11 08:00:22,826][model3_sft.py][INFO] Epoch:[0/2](2150/63764) loss:3.183 lr:0.0000200 epoch_Time:377.0min:
[2023-12-11 08:00:41,165][model3_sft.py][INFO] Epoch:[0/2](2200/63764) loss:3.538 lr:0.0000200 epoch_Time:377.0min:
[2023-12-11 08:00:59,518][model3_sft.py][INFO] Epoch:[0/2](2250/63764) loss:3.220 lr:0.0000200 epoch_Time:377.0min:
[2023-12-11 08:01:17,848][model3_sft.py][INFO] Epoch:[0/2](2300/63764) loss:2.866 lr:0.0000200 epoch_Time:376.0min:
[2023-12-11 08:01:36,192][model3_sft.py][INFO] Epoch:[0/2](2350/63764) loss:3.107 lr:0.0000200 epoch_Time:376.0min:
[2023-12-11 08:01:54,505][model3_sft.py][INFO] Epoch:[0/2](2400/63764) loss:2.631 lr:0.0000200 epoch_Time:376.0min:
[2023-12-11 08:02:12,843][model3_sft.py][INFO] Epoch:[0/2](2450/63764) loss:3.130 lr:0.0000200 epoch_Time:376.0min:
[2023-12-11 08:02:31,190][model3_sft.py][INFO] Epoch:[0/2](2500/63764) loss:3.806 lr:0.0000200 epoch_Time:375.0min:
[2023-12-11 08:02:49,569][model3_sft.py][INFO] Epoch:[0/2](2550/63764) loss:3.790 lr:0.0000200 epoch_Time:375.0min:
[2023-12-11 08:03:07,899][model3_sft.py][INFO] Epoch:[0/2](2600/63764) loss:3.568 lr:0.0000200 epoch_Time:375.0min:
[2023-12-11 08:03:26,221][model3_sft.py][INFO] Epoch:[0/2](2650/63764) loss:2.743 lr:0.0000199 epoch_Time:374.0min:
[2023-12-11 08:03:44,551][model3_sft.py][INFO] Epoch:[0/2](2700/63764) loss:3.620 lr:0.0000199 epoch_Time:374.0min:
[2023-12-11 08:04:02,865][model3_sft.py][INFO] Epoch:[0/2](2750/63764) loss:3.367 lr:0.0000199 epoch_Time:374.0min:
[2023-12-11 08:04:21,204][model3_sft.py][INFO] Epoch:[0/2](2800/63764) loss:3.738 lr:0.0000199 epoch_Time:373.0min:
[2023-12-11 08:04:39,560][model3_sft.py][INFO] Epoch:[0/2](2850/63764) loss:3.787 lr:0.0000199 epoch_Time:373.0min:
[2023-12-11 08:04:57,884][model3_sft.py][INFO] Epoch:[0/2](2900/63764) loss:3.012 lr:0.0000199 epoch_Time:373.0min:
[2023-12-11 08:05:16,237][model3_sft.py][INFO] Epoch:[0/2](2950/63764) loss:3.892 lr:0.0000199 epoch_Time:372.0min:
[2023-12-11 08:05:34,566][model3_sft.py][INFO] Epoch:[0/2](3000/63764) loss:4.040 lr:0.0000199 epoch_Time:372.0min:
[2023-12-11 08:05:53,104][model3_sft.py][INFO] Epoch:[0/2](3050/63764) loss:3.977 lr:0.0000199 epoch_Time:372.0min:
[2023-12-11 08:06:11,480][model3_sft.py][INFO] Epoch:[0/2](3100/63764) loss:3.309 lr:0.0000199 epoch_Time:372.0min:
[2023-12-11 08:06:29,846][model3_sft.py][INFO] Epoch:[0/2](3150/63764) loss:3.631 lr:0.0000199 epoch_Time:371.0min:
[2023-12-11 08:06:48,176][model3_sft.py][INFO] Epoch:[0/2](3200/63764) loss:3.747 lr:0.0000199 epoch_Time:371.0min:
[2023-12-11 08:07:06,540][model3_sft.py][INFO] Epoch:[0/2](3250/63764) loss:3.649 lr:0.0000199 epoch_Time:371.0min:
[2023-12-11 08:07:24,883][model3_sft.py][INFO] Epoch:[0/2](3300/63764) loss:3.075 lr:0.0000199 epoch_Time:370.0min:
[2023-12-11 08:07:43,223][model3_sft.py][INFO] Epoch:[0/2](3350/63764) loss:4.069 lr:0.0000199 epoch_Time:370.0min:
[2023-12-11 08:08:01,591][model3_sft.py][INFO] Epoch:[0/2](3400/63764) loss:2.888 lr:0.0000199 epoch_Time:370.0min:
[2023-12-11 08:08:19,943][model3_sft.py][INFO] Epoch:[0/2](3450/63764) loss:3.907 lr:0.0000199 epoch_Time:369.0min:
[2023-12-11 08:08:38,287][model3_sft.py][INFO] Epoch:[0/2](3500/63764) loss:3.280 lr:0.0000199 epoch_Time:369.0min:
[2023-12-11 08:08:56,642][model3_sft.py][INFO] Epoch:[0/2](3550/63764) loss:3.092 lr:0.0000199 epoch_Time:369.0min:
[2023-12-11 08:09:14,991][model3_sft.py][INFO] Epoch:[0/2](3600/63764) loss:3.639 lr:0.0000199 epoch_Time:368.0min:
[2023-12-11 08:09:33,336][model3_sft.py][INFO] Epoch:[0/2](3650/63764) loss:3.259 lr:0.0000199 epoch_Time:368.0min:
[2023-12-11 08:09:51,694][model3_sft.py][INFO] Epoch:[0/2](3700/63764) loss:4.318 lr:0.0000199 epoch_Time:368.0min:
[2023-12-11 08:10:10,082][model3_sft.py][INFO] Epoch:[0/2](3750/63764) loss:3.715 lr:0.0000199 epoch_Time:368.0min:
[2023-12-11 08:10:28,444][model3_sft.py][INFO] Epoch:[0/2](3800/63764) loss:3.637 lr:0.0000198 epoch_Time:367.0min:
[2023-12-11 08:10:46,785][model3_sft.py][INFO] Epoch:[0/2](3850/63764) loss:3.345 lr:0.0000198 epoch_Time:367.0min:
[2023-12-11 08:11:05,156][model3_sft.py][INFO] Epoch:[0/2](3900/63764) loss:3.821 lr:0.0000198 epoch_Time:367.0min:
[2023-12-11 08:11:23,518][model3_sft.py][INFO] Epoch:[0/2](3950/63764) loss:3.280 lr:0.0000198 epoch_Time:366.0min:
[2023-12-11 08:11:42,098][model3_sft.py][INFO] Epoch:[0/2](4000/63764) loss:2.856 lr:0.0000198 epoch_Time:366.0min:
[2023-12-11 08:12:00,485][model3_sft.py][INFO] Epoch:[0/2](4050/63764) loss:3.324 lr:0.0000198 epoch_Time:366.0min:
[2023-12-11 08:12:18,903][model3_sft.py][INFO] Epoch:[0/2](4100/63764) loss:3.927 lr:0.0000198 epoch_Time:365.0min:
[2023-12-11 08:12:37,327][model3_sft.py][INFO] Epoch:[0/2](4150/63764) loss:3.565 lr:0.0000198 epoch_Time:365.0min:
[2023-12-11 08:12:55,716][model3_sft.py][INFO] Epoch:[0/2](4200/63764) loss:3.486 lr:0.0000198 epoch_Time:365.0min:
[2023-12-11 08:13:14,086][model3_sft.py][INFO] Epoch:[0/2](4250/63764) loss:3.737 lr:0.0000198 epoch_Time:364.0min:
[2023-12-11 08:13:32,447][model3_sft.py][INFO] Epoch:[0/2](4300/63764) loss:3.809 lr:0.0000198 epoch_Time:364.0min:
[2023-12-11 08:13:50,818][model3_sft.py][INFO] Epoch:[0/2](4350/63764) loss:3.570 lr:0.0000198 epoch_Time:364.0min:
[2023-12-11 08:14:09,210][model3_sft.py][INFO] Epoch:[0/2](4400/63764) loss:4.103 lr:0.0000198 epoch_Time:364.0min:
[2023-12-11 08:14:27,571][model3_sft.py][INFO] Epoch:[0/2](4450/63764) loss:3.321 lr:0.0000198 epoch_Time:363.0min:
[2023-12-11 08:14:45,931][model3_sft.py][INFO] Epoch:[0/2](4500/63764) loss:4.004 lr:0.0000198 epoch_Time:363.0min:
[2023-12-11 08:15:04,304][model3_sft.py][INFO] Epoch:[0/2](4550/63764) loss:4.055 lr:0.0000198 epoch_Time:363.0min:
[2023-12-11 08:15:22,636][model3_sft.py][INFO] Epoch:[0/2](4600/63764) loss:3.599 lr:0.0000197 epoch_Time:362.0min:
[2023-12-11 08:15:40,995][model3_sft.py][INFO] Epoch:[0/2](4650/63764) loss:3.678 lr:0.0000197 epoch_Time:362.0min:
[2023-12-11 08:15:59,364][model3_sft.py][INFO] Epoch:[0/2](4700/63764) loss:4.138 lr:0.0000197 epoch_Time:362.0min:
[2023-12-11 08:16:17,771][model3_sft.py][INFO] Epoch:[0/2](4750/63764) loss:3.323 lr:0.0000197 epoch_Time:361.0min:
[2023-12-11 08:16:36,124][model3_sft.py][INFO] Epoch:[0/2](4800/63764) loss:3.835 lr:0.0000197 epoch_Time:361.0min:
[2023-12-11 08:16:54,484][model3_sft.py][INFO] Epoch:[0/2](4850/63764) loss:3.885 lr:0.0000197 epoch_Time:361.0min:
[2023-12-11 08:17:12,890][model3_sft.py][INFO] Epoch:[0/2](4900/63764) loss:3.213 lr:0.0000197 epoch_Time:361.0min:
[2023-12-11 08:17:31,445][model3_sft.py][INFO] Epoch:[0/2](4950/63764) loss:3.933 lr:0.0000197 epoch_Time:360.0min:
[2023-12-11 08:17:49,784][model3_sft.py][INFO] Epoch:[0/2](5000/63764) loss:4.114 lr:0.0000197 epoch_Time:360.0min:
[2023-12-11 08:18:08,163][model3_sft.py][INFO] Epoch:[0/2](5050/63764) loss:3.464 lr:0.0000197 epoch_Time:360.0min:
[2023-12-11 08:18:26,539][model3_sft.py][INFO] Epoch:[0/2](5100/63764) loss:3.476 lr:0.0000197 epoch_Time:359.0min:
[2023-12-11 08:18:44,893][model3_sft.py][INFO] Epoch:[0/2](5150/63764) loss:4.896 lr:0.0000197 epoch_Time:359.0min:
[2023-12-11 08:19:03,239][model3_sft.py][INFO] Epoch:[0/2](5200/63764) loss:3.318 lr:0.0000197 epoch_Time:359.0min:
[2023-12-11 08:19:21,574][model3_sft.py][INFO] Epoch:[0/2](5250/63764) loss:3.986 lr:0.0000196 epoch_Time:358.0min:
[2023-12-11 08:19:39,955][model3_sft.py][INFO] Epoch:[0/2](5300/63764) loss:4.078 lr:0.0000196 epoch_Time:358.0min:
[2023-12-11 08:19:58,315][model3_sft.py][INFO] Epoch:[0/2](5350/63764) loss:3.841 lr:0.0000196 epoch_Time:358.0min:
[2023-12-11 08:20:16,673][model3_sft.py][INFO] Epoch:[0/2](5400/63764) loss:3.702 lr:0.0000196 epoch_Time:357.0min:
[2023-12-11 08:20:35,029][model3_sft.py][INFO] Epoch:[0/2](5450/63764) loss:3.574 lr:0.0000196 epoch_Time:357.0min:
[2023-12-11 08:20:53,387][model3_sft.py][INFO] Epoch:[0/2](5500/63764) loss:3.734 lr:0.0000196 epoch_Time:357.0min:
[2023-12-11 08:21:11,746][model3_sft.py][INFO] Epoch:[0/2](5550/63764) loss:3.325 lr:0.0000196 epoch_Time:357.0min:
[2023-12-11 08:21:30,077][model3_sft.py][INFO] Epoch:[0/2](5600/63764) loss:3.475 lr:0.0000196 epoch_Time:356.0min:
[2023-12-11 08:21:48,438][model3_sft.py][INFO] Epoch:[0/2](5650/63764) loss:3.889 lr:0.0000196 epoch_Time:356.0min:
[2023-12-11 08:22:06,787][model3_sft.py][INFO] Epoch:[0/2](5700/63764) loss:3.839 lr:0.0000196 epoch_Time:356.0min:
[2023-12-11 08:22:25,151][model3_sft.py][INFO] Epoch:[0/2](5750/63764) loss:4.303 lr:0.0000196 epoch_Time:355.0min:
[2023-12-11 08:22:43,485][model3_sft.py][INFO] Epoch:[0/2](5800/63764) loss:3.810 lr:0.0000196 epoch_Time:355.0min:
[2023-12-11 08:23:01,825][model3_sft.py][INFO] Epoch:[0/2](5850/63764) loss:3.397 lr:0.0000195 epoch_Time:355.0min:
[2023-12-11 08:23:20,437][model3_sft.py][INFO] Epoch:[0/2](5900/63764) loss:4.376 lr:0.0000195 epoch_Time:354.0min:
[2023-12-11 08:23:38,809][model3_sft.py][INFO] Epoch:[0/2](5950/63764) loss:3.637 lr:0.0000195 epoch_Time:354.0min:
[2023-12-11 08:23:57,193][model3_sft.py][INFO] Epoch:[0/2](6000/63764) loss:4.102 lr:0.0000195 epoch_Time:354.0min:
[2023-12-11 08:24:15,523][model3_sft.py][INFO] Epoch:[0/2](6050/63764) loss:3.373 lr:0.0000195 epoch_Time:353.0min:
[2023-12-11 08:24:33,867][model3_sft.py][INFO] Epoch:[0/2](6100/63764) loss:3.253 lr:0.0000195 epoch_Time:353.0min:
[2023-12-11 08:24:52,221][model3_sft.py][INFO] Epoch:[0/2](6150/63764) loss:3.837 lr:0.0000195 epoch_Time:353.0min:
[2023-12-11 08:25:10,576][model3_sft.py][INFO] Epoch:[0/2](6200/63764) loss:3.314 lr:0.0000195 epoch_Time:353.0min:
[2023-12-11 08:25:28,931][model3_sft.py][INFO] Epoch:[0/2](6250/63764) loss:4.379 lr:0.0000195 epoch_Time:352.0min:
[2023-12-11 08:25:47,275][model3_sft.py][INFO] Epoch:[0/2](6300/63764) loss:3.929 lr:0.0000195 epoch_Time:352.0min:
[2023-12-11 08:26:05,635][model3_sft.py][INFO] Epoch:[0/2](6350/63764) loss:3.173 lr:0.0000194 epoch_Time:352.0min:
[2023-12-11 08:26:24,015][model3_sft.py][INFO] Epoch:[0/2](6400/63764) loss:4.540 lr:0.0000194 epoch_Time:351.0min:
[2023-12-11 08:26:42,356][model3_sft.py][INFO] Epoch:[0/2](6450/63764) loss:3.933 lr:0.0000194 epoch_Time:351.0min:
[2023-12-11 08:27:00,726][model3_sft.py][INFO] Epoch:[0/2](6500/63764) loss:3.587 lr:0.0000194 epoch_Time:351.0min:
[2023-12-11 08:27:19,108][model3_sft.py][INFO] Epoch:[0/2](6550/63764) loss:2.874 lr:0.0000194 epoch_Time:350.0min:
[2023-12-11 08:27:37,472][model3_sft.py][INFO] Epoch:[0/2](6600/63764) loss:3.845 lr:0.0000194 epoch_Time:350.0min:
[2023-12-11 08:27:55,853][model3_sft.py][INFO] Epoch:[0/2](6650/63764) loss:4.106 lr:0.0000194 epoch_Time:350.0min:
[2023-12-11 08:28:14,229][model3_sft.py][INFO] Epoch:[0/2](6700/63764) loss:3.580 lr:0.0000194 epoch_Time:349.0min:
[2023-12-11 08:28:32,608][model3_sft.py][INFO] Epoch:[0/2](6750/63764) loss:3.493 lr:0.0000194 epoch_Time:349.0min:
[2023-12-11 08:28:50,976][model3_sft.py][INFO] Epoch:[0/2](6800/63764) loss:3.798 lr:0.0000194 epoch_Time:349.0min:
[2023-12-11 08:29:09,600][model3_sft.py][INFO] Epoch:[0/2](6850/63764) loss:3.577 lr:0.0000193 epoch_Time:349.0min:
[2023-12-11 08:29:27,984][model3_sft.py][INFO] Epoch:[0/2](6900/63764) loss:4.152 lr:0.0000193 epoch_Time:348.0min:
[2023-12-11 08:29:46,327][model3_sft.py][INFO] Epoch:[0/2](6950/63764) loss:3.498 lr:0.0000193 epoch_Time:348.0min:
[2023-12-11 08:30:04,695][model3_sft.py][INFO] Epoch:[0/2](7000/63764) loss:3.448 lr:0.0000193 epoch_Time:348.0min:
[2023-12-11 08:30:23,060][model3_sft.py][INFO] Epoch:[0/2](7050/63764) loss:3.405 lr:0.0000193 epoch_Time:347.0min:
[2023-12-11 08:30:41,421][model3_sft.py][INFO] Epoch:[0/2](7100/63764) loss:3.667 lr:0.0000193 epoch_Time:347.0min:
[2023-12-11 08:30:59,791][model3_sft.py][INFO] Epoch:[0/2](7150/63764) loss:3.384 lr:0.0000193 epoch_Time:347.0min:
[2023-12-11 08:31:18,184][model3_sft.py][INFO] Epoch:[0/2](7200/63764) loss:3.924 lr:0.0000193 epoch_Time:346.0min:
[2023-12-11 08:31:36,580][model3_sft.py][INFO] Epoch:[0/2](7250/63764) loss:3.931 lr:0.0000192 epoch_Time:346.0min:
[2023-12-11 08:31:54,938][model3_sft.py][INFO] Epoch:[0/2](7300/63764) loss:3.634 lr:0.0000192 epoch_Time:346.0min:
[2023-12-11 08:32:13,330][model3_sft.py][INFO] Epoch:[0/2](7350/63764) loss:3.387 lr:0.0000192 epoch_Time:345.0min:
[2023-12-11 08:32:31,728][model3_sft.py][INFO] Epoch:[0/2](7400/63764) loss:3.449 lr:0.0000192 epoch_Time:345.0min:
[2023-12-11 08:32:50,096][model3_sft.py][INFO] Epoch:[0/2](7450/63764) loss:3.337 lr:0.0000192 epoch_Time:345.0min:
[2023-12-11 08:33:08,479][model3_sft.py][INFO] Epoch:[0/2](7500/63764) loss:3.202 lr:0.0000192 epoch_Time:345.0min:
[2023-12-11 08:33:26,863][model3_sft.py][INFO] Epoch:[0/2](7550/63764) loss:3.710 lr:0.0000192 epoch_Time:344.0min:
[2023-12-11 08:33:45,250][model3_sft.py][INFO] Epoch:[0/2](7600/63764) loss:3.620 lr:0.0000192 epoch_Time:344.0min:
[2023-12-11 08:34:03,633][model3_sft.py][INFO] Epoch:[0/2](7650/63764) loss:3.622 lr:0.0000191 epoch_Time:344.0min:
[2023-12-11 08:34:21,978][model3_sft.py][INFO] Epoch:[0/2](7700/63764) loss:3.981 lr:0.0000191 epoch_Time:343.0min:
[2023-12-11 08:34:40,356][model3_sft.py][INFO] Epoch:[0/2](7750/63764) loss:4.487 lr:0.0000191 epoch_Time:343.0min:
[2023-12-11 08:34:58,763][model3_sft.py][INFO] Epoch:[0/2](7800/63764) loss:3.807 lr:0.0000191 epoch_Time:343.0min:
[2023-12-11 08:35:17,376][model3_sft.py][INFO] Epoch:[0/2](7850/63764) loss:3.105 lr:0.0000191 epoch_Time:342.0min:
[2023-12-11 08:35:35,754][model3_sft.py][INFO] Epoch:[0/2](7900/63764) loss:4.174 lr:0.0000191 epoch_Time:342.0min:
[2023-12-11 08:35:54,148][model3_sft.py][INFO] Epoch:[0/2](7950/63764) loss:3.670 lr:0.0000191 epoch_Time:342.0min:
[2023-12-11 08:36:12,525][model3_sft.py][INFO] Epoch:[0/2](8000/63764) loss:3.648 lr:0.0000191 epoch_Time:342.0min:
[2023-12-11 08:36:30,910][model3_sft.py][INFO] Epoch:[0/2](8050/63764) loss:3.965 lr:0.0000190 epoch_Time:341.0min:
[2023-12-11 08:36:49,301][model3_sft.py][INFO] Epoch:[0/2](8100/63764) loss:3.779 lr:0.0000190 epoch_Time:341.0min:
[2023-12-11 08:37:07,685][model3_sft.py][INFO] Epoch:[0/2](8150/63764) loss:4.010 lr:0.0000190 epoch_Time:341.0min:
[2023-12-11 08:37:26,060][model3_sft.py][INFO] Epoch:[0/2](8200/63764) loss:3.903 lr:0.0000190 epoch_Time:340.0min:
[2023-12-11 08:37:44,447][model3_sft.py][INFO] Epoch:[0/2](8250/63764) loss:3.973 lr:0.0000190 epoch_Time:340.0min:
[2023-12-11 08:38:02,877][model3_sft.py][INFO] Epoch:[0/2](8300/63764) loss:3.918 lr:0.0000190 epoch_Time:340.0min:
[2023-12-11 08:38:21,290][model3_sft.py][INFO] Epoch:[0/2](8350/63764) loss:4.337 lr:0.0000190 epoch_Time:339.0min:
[2023-12-11 08:38:39,643][model3_sft.py][INFO] Epoch:[0/2](8400/63764) loss:3.388 lr:0.0000190 epoch_Time:339.0min:
[2023-12-11 08:38:58,020][model3_sft.py][INFO] Epoch:[0/2](8450/63764) loss:3.439 lr:0.0000189 epoch_Time:339.0min:
[2023-12-11 08:39:16,393][model3_sft.py][INFO] Epoch:[0/2](8500/63764) loss:3.920 lr:0.0000189 epoch_Time:338.0min:
[2023-12-11 08:39:34,802][model3_sft.py][INFO] Epoch:[0/2](8550/63764) loss:3.702 lr:0.0000189 epoch_Time:338.0min:
[2023-12-11 08:39:53,155][model3_sft.py][INFO] Epoch:[0/2](8600/63764) loss:4.346 lr:0.0000189 epoch_Time:338.0min:
[2023-12-11 08:40:11,551][model3_sft.py][INFO] Epoch:[0/2](8650/63764) loss:3.694 lr:0.0000189 epoch_Time:338.0min:
[2023-12-11 08:40:29,938][model3_sft.py][INFO] Epoch:[0/2](8700/63764) loss:3.626 lr:0.0000189 epoch_Time:337.0min:
[2023-12-11 08:40:48,328][model3_sft.py][INFO] Epoch:[0/2](8750/63764) loss:4.056 lr:0.0000189 epoch_Time:337.0min:
[2023-12-11 08:41:06,932][model3_sft.py][INFO] Epoch:[0/2](8800/63764) loss:3.861 lr:0.0000188 epoch_Time:337.0min:
[2023-12-11 08:41:25,284][model3_sft.py][INFO] Epoch:[0/2](8850/63764) loss:3.837 lr:0.0000188 epoch_Time:336.0min:
[2023-12-11 08:41:43,642][model3_sft.py][INFO] Epoch:[0/2](8900/63764) loss:3.356 lr:0.0000188 epoch_Time:336.0min:
[2023-12-11 08:42:02,035][model3_sft.py][INFO] Epoch:[0/2](8950/63764) loss:4.130 lr:0.0000188 epoch_Time:336.0min:
[2023-12-11 08:42:20,420][model3_sft.py][INFO] Epoch:[0/2](9000/63764) loss:3.339 lr:0.0000188 epoch_Time:335.0min:
[2023-12-11 08:42:38,803][model3_sft.py][INFO] Epoch:[0/2](9050/63764) loss:4.624 lr:0.0000188 epoch_Time:335.0min:
[2023-12-11 08:42:57,156][model3_sft.py][INFO] Epoch:[0/2](9100/63764) loss:4.411 lr:0.0000187 epoch_Time:335.0min:
[2023-12-11 08:43:15,553][model3_sft.py][INFO] Epoch:[0/2](9150/63764) loss:3.815 lr:0.0000187 epoch_Time:334.0min:
[2023-12-11 08:43:33,990][model3_sft.py][INFO] Epoch:[0/2](9200/63764) loss:3.242 lr:0.0000187 epoch_Time:334.0min:
[2023-12-11 08:43:52,321][model3_sft.py][INFO] Epoch:[0/2](9250/63764) loss:3.328 lr:0.0000187 epoch_Time:334.0min:
[2023-12-11 08:44:10,709][model3_sft.py][INFO] Epoch:[0/2](9300/63764) loss:3.206 lr:0.0000187 epoch_Time:334.0min:
[2023-12-11 08:44:29,124][model3_sft.py][INFO] Epoch:[0/2](9350/63764) loss:3.606 lr:0.0000187 epoch_Time:333.0min:
[2023-12-11 08:44:47,496][model3_sft.py][INFO] Epoch:[0/2](9400/63764) loss:3.334 lr:0.0000187 epoch_Time:333.0min:
[2023-12-11 08:45:05,867][model3_sft.py][INFO] Epoch:[0/2](9450/63764) loss:3.734 lr:0.0000186 epoch_Time:333.0min:
[2023-12-11 08:45:24,250][model3_sft.py][INFO] Epoch:[0/2](9500/63764) loss:3.261 lr:0.0000186 epoch_Time:332.0min:
[2023-12-11 08:45:42,595][model3_sft.py][INFO] Epoch:[0/2](9550/63764) loss:3.931 lr:0.0000186 epoch_Time:332.0min:
[2023-12-11 08:46:00,995][model3_sft.py][INFO] Epoch:[0/2](9600/63764) loss:3.881 lr:0.0000186 epoch_Time:332.0min:
[2023-12-11 08:46:19,358][model3_sft.py][INFO] Epoch:[0/2](9650/63764) loss:4.215 lr:0.0000186 epoch_Time:331.0min:
[2023-12-11 08:46:37,753][model3_sft.py][INFO] Epoch:[0/2](9700/63764) loss:4.355 lr:0.0000186 epoch_Time:331.0min:
[2023-12-11 08:46:56,332][model3_sft.py][INFO] Epoch:[0/2](9750/63764) loss:3.680 lr:0.0000185 epoch_Time:331.0min:
[2023-12-11 08:47:14,706][model3_sft.py][INFO] Epoch:[0/2](9800/63764) loss:4.437 lr:0.0000185 epoch_Time:330.0min:
[2023-12-11 08:47:33,089][model3_sft.py][INFO] Epoch:[0/2](9850/63764) loss:4.049 lr:0.0000185 epoch_Time:330.0min:
[2023-12-11 08:47:51,490][model3_sft.py][INFO] Epoch:[0/2](9900/63764) loss:3.716 lr:0.0000185 epoch_Time:330.0min:
[2023-12-11 08:48:09,869][model3_sft.py][INFO] Epoch:[0/2](9950/63764) loss:3.570 lr:0.0000185 epoch_Time:330.0min:
[2023-12-11 08:48:28,221][model3_sft.py][INFO] Epoch:[0/2](10000/63764) loss:3.040 lr:0.0000185 epoch_Time:329.0min:
[2023-12-11 08:48:46,610][model3_sft.py][INFO] Epoch:[0/2](10050/63764) loss:3.391 lr:0.0000184 epoch_Time:329.0min:
[2023-12-11 08:49:05,001][model3_sft.py][INFO] Epoch:[0/2](10100/63764) loss:4.391 lr:0.0000184 epoch_Time:329.0min:
[2023-12-11 08:49:23,367][model3_sft.py][INFO] Epoch:[0/2](10150/63764) loss:3.100 lr:0.0000184 epoch_Time:328.0min:
[2023-12-11 08:49:41,772][model3_sft.py][INFO] Epoch:[0/2](10200/63764) loss:3.562 lr:0.0000184 epoch_Time:328.0min:
[2023-12-11 08:50:00,217][model3_sft.py][INFO] Epoch:[0/2](10250/63764) loss:3.784 lr:0.0000184 epoch_Time:328.0min:
[2023-12-11 08:50:18,665][model3_sft.py][INFO] Epoch:[0/2](10300/63764) loss:3.079 lr:0.0000184 epoch_Time:327.0min:
[2023-12-11 08:50:37,042][model3_sft.py][INFO] Epoch:[0/2](10350/63764) loss:3.351 lr:0.0000183 epoch_Time:327.0min:
[2023-12-11 08:50:55,454][model3_sft.py][INFO] Epoch:[0/2](10400/63764) loss:3.156 lr:0.0000183 epoch_Time:327.0min:
[2023-12-11 08:51:13,875][model3_sft.py][INFO] Epoch:[0/2](10450/63764) loss:3.154 lr:0.0000183 epoch_Time:326.0min:
[2023-12-11 08:51:32,247][model3_sft.py][INFO] Epoch:[0/2](10500/63764) loss:3.278 lr:0.0000183 epoch_Time:326.0min:
[2023-12-11 08:51:50,590][model3_sft.py][INFO] Epoch:[0/2](10550/63764) loss:4.038 lr:0.0000183 epoch_Time:326.0min:
[2023-12-11 08:52:08,957][model3_sft.py][INFO] Epoch:[0/2](10600/63764) loss:4.311 lr:0.0000183 epoch_Time:326.0min:
[2023-12-11 08:52:27,322][model3_sft.py][INFO] Epoch:[0/2](10650/63764) loss:3.820 lr:0.0000182 epoch_Time:325.0min:
[2023-12-11 08:52:45,942][model3_sft.py][INFO] Epoch:[0/2](10700/63764) loss:3.644 lr:0.0000182 epoch_Time:325.0min:
[2023-12-11 08:53:04,333][model3_sft.py][INFO] Epoch:[0/2](10750/63764) loss:3.790 lr:0.0000182 epoch_Time:325.0min:
[2023-12-11 08:53:22,696][model3_sft.py][INFO] Epoch:[0/2](10800/63764) loss:3.966 lr:0.0000182 epoch_Time:324.0min:
[2023-12-11 08:53:41,049][model3_sft.py][INFO] Epoch:[0/2](10850/63764) loss:3.535 lr:0.0000182 epoch_Time:324.0min:
[2023-12-11 08:53:59,410][model3_sft.py][INFO] Epoch:[0/2](10900/63764) loss:4.338 lr:0.0000181 epoch_Time:324.0min:
[2023-12-11 08:54:17,807][model3_sft.py][INFO] Epoch:[0/2](10950/63764) loss:3.919 lr:0.0000181 epoch_Time:323.0min:
[2023-12-11 08:54:36,192][model3_sft.py][INFO] Epoch:[0/2](11000/63764) loss:3.807 lr:0.0000181 epoch_Time:323.0min:
[2023-12-11 08:54:54,554][model3_sft.py][INFO] Epoch:[0/2](11050/63764) loss:3.605 lr:0.0000181 epoch_Time:323.0min:
[2023-12-11 08:55:12,929][model3_sft.py][INFO] Epoch:[0/2](11100/63764) loss:3.006 lr:0.0000181 epoch_Time:323.0min:
[2023-12-11 08:55:31,317][model3_sft.py][INFO] Epoch:[0/2](11150/63764) loss:4.078 lr:0.0000181 epoch_Time:322.0min:
[2023-12-11 08:55:49,673][model3_sft.py][INFO] Epoch:[0/2](11200/63764) loss:3.496 lr:0.0000180 epoch_Time:322.0min:
[2023-12-11 08:56:08,065][model3_sft.py][INFO] Epoch:[0/2](11250/63764) loss:3.585 lr:0.0000180 epoch_Time:322.0min:
[2023-12-11 08:56:26,481][model3_sft.py][INFO] Epoch:[0/2](11300/63764) loss:3.944 lr:0.0000180 epoch_Time:321.0min:
[2023-12-11 08:56:44,864][model3_sft.py][INFO] Epoch:[0/2](11350/63764) loss:3.644 lr:0.0000180 epoch_Time:321.0min:
[2023-12-11 08:57:03,248][model3_sft.py][INFO] Epoch:[0/2](11400/63764) loss:3.390 lr:0.0000180 epoch_Time:321.0min:
[2023-12-11 08:57:21,605][model3_sft.py][INFO] Epoch:[0/2](11450/63764) loss:3.731 lr:0.0000179 epoch_Time:320.0min:
[2023-12-11 08:57:39,981][model3_sft.py][INFO] Epoch:[0/2](11500/63764) loss:3.393 lr:0.0000179 epoch_Time:320.0min:
[2023-12-11 08:57:58,355][model3_sft.py][INFO] Epoch:[0/2](11550/63764) loss:4.458 lr:0.0000179 epoch_Time:320.0min:
[2023-12-11 08:58:16,769][model3_sft.py][INFO] Epoch:[0/2](11600/63764) loss:3.739 lr:0.0000179 epoch_Time:319.0min:
[2023-12-11 08:58:35,370][model3_sft.py][INFO] Epoch:[0/2](11650/63764) loss:2.673 lr:0.0000179 epoch_Time:319.0min:
[2023-12-11 08:58:53,767][model3_sft.py][INFO] Epoch:[0/2](11700/63764) loss:3.798 lr:0.0000179 epoch_Time:319.0min:
[2023-12-11 08:59:12,133][model3_sft.py][INFO] Epoch:[0/2](11750/63764) loss:3.676 lr:0.0000178 epoch_Time:319.0min:
[2023-12-11 08:59:30,590][model3_sft.py][INFO] Epoch:[0/2](11800/63764) loss:3.801 lr:0.0000178 epoch_Time:318.0min:
[2023-12-11 08:59:48,949][model3_sft.py][INFO] Epoch:[0/2](11850/63764) loss:3.802 lr:0.0000178 epoch_Time:318.0min:
[2023-12-11 09:00:07,286][model3_sft.py][INFO] Epoch:[0/2](11900/63764) loss:3.726 lr:0.0000178 epoch_Time:318.0min:
[2023-12-11 09:00:25,663][model3_sft.py][INFO] Epoch:[0/2](11950/63764) loss:2.993 lr:0.0000178 epoch_Time:317.0min:
[2023-12-11 09:00:44,048][model3_sft.py][INFO] Epoch:[0/2](12000/63764) loss:3.216 lr:0.0000177 epoch_Time:317.0min:
[2023-12-11 09:01:02,459][model3_sft.py][INFO] Epoch:[0/2](12050/63764) loss:2.779 lr:0.0000177 epoch_Time:317.0min:
[2023-12-11 09:01:20,816][model3_sft.py][INFO] Epoch:[0/2](12100/63764) loss:3.724 lr:0.0000177 epoch_Time:316.0min:
[2023-12-11 09:01:39,204][model3_sft.py][INFO] Epoch:[0/2](12150/63764) loss:4.219 lr:0.0000177 epoch_Time:316.0min:
[2023-12-11 09:01:57,606][model3_sft.py][INFO] Epoch:[0/2](12200/63764) loss:3.500 lr:0.0000177 epoch_Time:316.0min:
[2023-12-11 09:02:16,006][model3_sft.py][INFO] Epoch:[0/2](12250/63764) loss:3.503 lr:0.0000176 epoch_Time:315.0min:
[2023-12-11 09:02:34,424][model3_sft.py][INFO] Epoch:[0/2](12300/63764) loss:3.747 lr:0.0000176 epoch_Time:315.0min:
[2023-12-11 09:02:52,791][model3_sft.py][INFO] Epoch:[0/2](12350/63764) loss:3.131 lr:0.0000176 epoch_Time:315.0min:
[2023-12-11 09:03:11,184][model3_sft.py][INFO] Epoch:[0/2](12400/63764) loss:2.950 lr:0.0000176 epoch_Time:315.0min:
[2023-12-11 09:03:29,585][model3_sft.py][INFO] Epoch:[0/2](12450/63764) loss:3.556 lr:0.0000176 epoch_Time:314.0min:
[2023-12-11 09:03:48,014][model3_sft.py][INFO] Epoch:[0/2](12500/63764) loss:3.233 lr:0.0000175 epoch_Time:314.0min:
[2023-12-11 09:04:06,397][model3_sft.py][INFO] Epoch:[0/2](12550/63764) loss:2.649 lr:0.0000175 epoch_Time:314.0min:
[2023-12-11 09:04:24,773][model3_sft.py][INFO] Epoch:[0/2](12600/63764) loss:3.431 lr:0.0000175 epoch_Time:313.0min:
[2023-12-11 09:04:43,326][model3_sft.py][INFO] Epoch:[0/2](12650/63764) loss:3.442 lr:0.0000175 epoch_Time:313.0min:
[2023-12-11 09:05:01,737][model3_sft.py][INFO] Epoch:[0/2](12700/63764) loss:3.669 lr:0.0000175 epoch_Time:313.0min:
[2023-12-11 09:05:20,112][model3_sft.py][INFO] Epoch:[0/2](12750/63764) loss:4.489 lr:0.0000174 epoch_Time:312.0min:
[2023-12-11 09:05:38,457][model3_sft.py][INFO] Epoch:[0/2](12800/63764) loss:3.308 lr:0.0000174 epoch_Time:312.0min:
[2023-12-11 09:05:56,853][model3_sft.py][INFO] Epoch:[0/2](12850/63764) loss:3.410 lr:0.0000174 epoch_Time:312.0min:
[2023-12-11 09:06:15,239][model3_sft.py][INFO] Epoch:[0/2](12900/63764) loss:3.863 lr:0.0000174 epoch_Time:311.0min:
[2023-12-11 09:06:33,605][model3_sft.py][INFO] Epoch:[0/2](12950/63764) loss:3.722 lr:0.0000173 epoch_Time:311.0min:
[2023-12-11 09:06:52,019][model3_sft.py][INFO] Epoch:[0/2](13000/63764) loss:3.817 lr:0.0000173 epoch_Time:311.0min:
[2023-12-11 09:07:10,390][model3_sft.py][INFO] Epoch:[0/2](13050/63764) loss:4.000 lr:0.0000173 epoch_Time:311.0min:
[2023-12-11 09:07:28,780][model3_sft.py][INFO] Epoch:[0/2](13100/63764) loss:4.016 lr:0.0000173 epoch_Time:310.0min:
[2023-12-11 09:07:47,164][model3_sft.py][INFO] Epoch:[0/2](13150/63764) loss:3.370 lr:0.0000173 epoch_Time:310.0min:
[2023-12-11 09:08:05,543][model3_sft.py][INFO] Epoch:[0/2](13200/63764) loss:3.644 lr:0.0000172 epoch_Time:310.0min:
[2023-12-11 09:08:23,893][model3_sft.py][INFO] Epoch:[0/2](13250/63764) loss:3.635 lr:0.0000172 epoch_Time:309.0min:
[2023-12-11 09:08:42,262][model3_sft.py][INFO] Epoch:[0/2](13300/63764) loss:4.433 lr:0.0000172 epoch_Time:309.0min:
[2023-12-11 09:09:00,627][model3_sft.py][INFO] Epoch:[0/2](13350/63764) loss:3.738 lr:0.0000172 epoch_Time:309.0min:
[2023-12-11 09:09:18,993][model3_sft.py][INFO] Epoch:[0/2](13400/63764) loss:3.506 lr:0.0000172 epoch_Time:308.0min:
[2023-12-11 09:09:37,344][model3_sft.py][INFO] Epoch:[0/2](13450/63764) loss:4.005 lr:0.0000171 epoch_Time:308.0min:
[2023-12-11 09:09:55,693][model3_sft.py][INFO] Epoch:[0/2](13500/63764) loss:3.472 lr:0.0000171 epoch_Time:308.0min:
[2023-12-11 09:10:14,083][model3_sft.py][INFO] Epoch:[0/2](13550/63764) loss:3.417 lr:0.0000171 epoch_Time:307.0min:
[2023-12-11 09:10:32,658][model3_sft.py][INFO] Epoch:[0/2](13600/63764) loss:3.285 lr:0.0000171 epoch_Time:307.0min:
[2023-12-11 09:10:51,055][model3_sft.py][INFO] Epoch:[0/2](13650/63764) loss:3.517 lr:0.0000170 epoch_Time:307.0min:
[2023-12-11 09:11:09,446][model3_sft.py][INFO] Epoch:[0/2](13700/63764) loss:3.387 lr:0.0000170 epoch_Time:307.0min:
[2023-12-11 09:11:27,815][model3_sft.py][INFO] Epoch:[0/2](13750/63764) loss:3.629 lr:0.0000170 epoch_Time:306.0min:
[2023-12-11 09:11:46,186][model3_sft.py][INFO] Epoch:[0/2](13800/63764) loss:3.521 lr:0.0000170 epoch_Time:306.0min:
[2023-12-11 09:12:04,569][model3_sft.py][INFO] Epoch:[0/2](13850/63764) loss:4.136 lr:0.0000170 epoch_Time:306.0min:
[2023-12-11 09:12:22,957][model3_sft.py][INFO] Epoch:[0/2](13900/63764) loss:3.330 lr:0.0000169 epoch_Time:305.0min:
[2023-12-11 09:12:41,373][model3_sft.py][INFO] Epoch:[0/2](13950/63764) loss:3.688 lr:0.0000169 epoch_Time:305.0min:
[2023-12-11 09:12:59,799][model3_sft.py][INFO] Epoch:[0/2](14000/63764) loss:3.663 lr:0.0000169 epoch_Time:305.0min:
[2023-12-11 09:13:18,204][model3_sft.py][INFO] Epoch:[0/2](14050/63764) loss:3.659 lr:0.0000169 epoch_Time:304.0min:
[2023-12-11 09:13:36,565][model3_sft.py][INFO] Epoch:[0/2](14100/63764) loss:4.414 lr:0.0000168 epoch_Time:304.0min:
[2023-12-11 09:13:54,925][model3_sft.py][INFO] Epoch:[0/2](14150/63764) loss:3.431 lr:0.0000168 epoch_Time:304.0min:
[2023-12-11 09:14:13,356][model3_sft.py][INFO] Epoch:[0/2](14200/63764) loss:3.984 lr:0.0000168 epoch_Time:303.0min:
[2023-12-11 09:14:31,772][model3_sft.py][INFO] Epoch:[0/2](14250/63764) loss:3.234 lr:0.0000168 epoch_Time:303.0min:
[2023-12-11 09:14:50,173][model3_sft.py][INFO] Epoch:[0/2](14300/63764) loss:3.943 lr:0.0000168 epoch_Time:303.0min:
[2023-12-11 09:15:08,545][model3_sft.py][INFO] Epoch:[0/2](14350/63764) loss:3.798 lr:0.0000167 epoch_Time:303.0min:
[2023-12-11 09:15:26,940][model3_sft.py][INFO] Epoch:[0/2](14400/63764) loss:3.577 lr:0.0000167 epoch_Time:302.0min:
[2023-12-11 09:15:45,301][model3_sft.py][INFO] Epoch:[0/2](14450/63764) loss:3.745 lr:0.0000167 epoch_Time:302.0min:
[2023-12-11 09:16:03,660][model3_sft.py][INFO] Epoch:[0/2](14500/63764) loss:3.117 lr:0.0000167 epoch_Time:302.0min:
[2023-12-11 09:16:22,245][model3_sft.py][INFO] Epoch:[0/2](14550/63764) loss:3.853 lr:0.0000166 epoch_Time:301.0min:
[2023-12-11 09:16:40,646][model3_sft.py][INFO] Epoch:[0/2](14600/63764) loss:3.766 lr:0.0000166 epoch_Time:301.0min:
[2023-12-11 09:16:59,045][model3_sft.py][INFO] Epoch:[0/2](14650/63764) loss:4.141 lr:0.0000166 epoch_Time:301.0min:
[2023-12-11 09:17:17,437][model3_sft.py][INFO] Epoch:[0/2](14700/63764) loss:4.084 lr:0.0000166 epoch_Time:300.0min:
[2023-12-11 09:17:35,835][model3_sft.py][INFO] Epoch:[0/2](14750/63764) loss:3.416 lr:0.0000165 epoch_Time:300.0min:
[2023-12-11 09:17:54,265][model3_sft.py][INFO] Epoch:[0/2](14800/63764) loss:3.391 lr:0.0000165 epoch_Time:300.0min:
[2023-12-11 09:18:12,665][model3_sft.py][INFO] Epoch:[0/2](14850/63764) loss:3.616 lr:0.0000165 epoch_Time:300.0min:
[2023-12-11 09:18:31,048][model3_sft.py][INFO] Epoch:[0/2](14900/63764) loss:3.736 lr:0.0000165 epoch_Time:299.0min:
[2023-12-11 09:18:49,481][model3_sft.py][INFO] Epoch:[0/2](14950/63764) loss:4.240 lr:0.0000164 epoch_Time:299.0min:
[2023-12-11 09:19:07,898][model3_sft.py][INFO] Epoch:[0/2](15000/63764) loss:3.660 lr:0.0000164 epoch_Time:299.0min:
[2023-12-11 09:19:26,280][model3_sft.py][INFO] Epoch:[0/2](15050/63764) loss:3.609 lr:0.0000164 epoch_Time:298.0min:
[2023-12-11 09:19:44,633][model3_sft.py][INFO] Epoch:[0/2](15100/63764) loss:3.416 lr:0.0000164 epoch_Time:298.0min:
[2023-12-11 09:20:02,994][model3_sft.py][INFO] Epoch:[0/2](15150/63764) loss:3.200 lr:0.0000164 epoch_Time:298.0min:
[2023-12-11 09:20:21,334][model3_sft.py][INFO] Epoch:[0/2](15200/63764) loss:3.199 lr:0.0000163 epoch_Time:297.0min:
[2023-12-11 09:20:39,647][model3_sft.py][INFO] Epoch:[0/2](15250/63764) loss:3.524 lr:0.0000163 epoch_Time:297.0min:
[2023-12-11 09:20:57,999][model3_sft.py][INFO] Epoch:[0/2](15300/63764) loss:3.045 lr:0.0000163 epoch_Time:297.0min:
[2023-12-11 09:21:16,334][model3_sft.py][INFO] Epoch:[0/2](15350/63764) loss:4.197 lr:0.0000163 epoch_Time:296.0min:
[2023-12-11 09:21:34,649][model3_sft.py][INFO] Epoch:[0/2](15400/63764) loss:2.499 lr:0.0000162 epoch_Time:296.0min:
[2023-12-11 09:21:53,008][model3_sft.py][INFO] Epoch:[0/2](15450/63764) loss:3.712 lr:0.0000162 epoch_Time:296.0min:
[2023-12-11 09:22:11,560][model3_sft.py][INFO] Epoch:[0/2](15500/63764) loss:3.118 lr:0.0000162 epoch_Time:296.0min:
[2023-12-11 09:22:29,933][model3_sft.py][INFO] Epoch:[0/2](15550/63764) loss:3.147 lr:0.0000162 epoch_Time:295.0min:
[2023-12-11 09:22:48,301][model3_sft.py][INFO] Epoch:[0/2](15600/63764) loss:3.503 lr:0.0000161 epoch_Time:295.0min:
[2023-12-11 09:23:06,623][model3_sft.py][INFO] Epoch:[0/2](15650/63764) loss:4.068 lr:0.0000161 epoch_Time:295.0min:
[2023-12-11 09:23:24,977][model3_sft.py][INFO] Epoch:[0/2](15700/63764) loss:4.192 lr:0.0000161 epoch_Time:294.0min:
[2023-12-11 09:23:43,323][model3_sft.py][INFO] Epoch:[0/2](15750/63764) loss:4.566 lr:0.0000161 epoch_Time:294.0min:
[2023-12-11 09:24:01,713][model3_sft.py][INFO] Epoch:[0/2](15800/63764) loss:3.355 lr:0.0000160 epoch_Time:294.0min:
[2023-12-11 09:24:20,087][model3_sft.py][INFO] Epoch:[0/2](15850/63764) loss:3.627 lr:0.0000160 epoch_Time:293.0min:
[2023-12-11 09:24:38,433][model3_sft.py][INFO] Epoch:[0/2](15900/63764) loss:3.512 lr:0.0000160 epoch_Time:293.0min:
[2023-12-11 09:24:56,787][model3_sft.py][INFO] Epoch:[0/2](15950/63764) loss:3.496 lr:0.0000160 epoch_Time:293.0min:
[2023-12-11 09:25:15,104][model3_sft.py][INFO] Epoch:[0/2](16000/63764) loss:3.534 lr:0.0000159 epoch_Time:292.0min:
[2023-12-11 09:25:33,455][model3_sft.py][INFO] Epoch:[0/2](16050/63764) loss:3.956 lr:0.0000159 epoch_Time:292.0min:
[2023-12-11 09:25:51,784][model3_sft.py][INFO] Epoch:[0/2](16100/63764) loss:2.821 lr:0.0000159 epoch_Time:292.0min:
[2023-12-11 09:26:10,110][model3_sft.py][INFO] Epoch:[0/2](16150/63764) loss:3.784 lr:0.0000159 epoch_Time:292.0min:
[2023-12-11 
09:26:28,440][model3_sft.py][INFO] Epoch:[0/2](16200/63764) loss:3.526 lr:0.0000158 epoch_Time:291.0min: [2023-12-11 09:26:46,784][model3_sft.py][INFO] Epoch:[0/2](16250/63764) loss:4.160 lr:0.0000158 epoch_Time:291.0min: [2023-12-11 09:27:05,133][model3_sft.py][INFO] Epoch:[0/2](16300/63764) loss:3.673 lr:0.0000158 epoch_Time:291.0min: [2023-12-11 09:27:23,471][model3_sft.py][INFO] Epoch:[0/2](16350/63764) loss:3.392 lr:0.0000158 epoch_Time:290.0min: [2023-12-11 09:27:41,791][model3_sft.py][INFO] Epoch:[0/2](16400/63764) loss:3.253 lr:0.0000157 epoch_Time:290.0min: [2023-12-11 09:28:00,340][model3_sft.py][INFO] Epoch:[0/2](16450/63764) loss:3.911 lr:0.0000157 epoch_Time:290.0min: [2023-12-11 09:28:18,648][model3_sft.py][INFO] Epoch:[0/2](16500/63764) loss:3.915 lr:0.0000157 epoch_Time:289.0min: [2023-12-11 09:28:36,997][model3_sft.py][INFO] Epoch:[0/2](16550/63764) loss:3.118 lr:0.0000157 epoch_Time:289.0min: [2023-12-11 09:28:55,330][model3_sft.py][INFO] Epoch:[0/2](16600/63764) loss:3.982 lr:0.0000156 epoch_Time:289.0min: [2023-12-11 09:29:13,665][model3_sft.py][INFO] Epoch:[0/2](16650/63764) loss:3.968 lr:0.0000156 epoch_Time:288.0min: [2023-12-11 09:29:32,000][model3_sft.py][INFO] Epoch:[0/2](16700/63764) loss:3.267 lr:0.0000156 epoch_Time:288.0min: [2023-12-11 09:29:50,359][model3_sft.py][INFO] Epoch:[0/2](16750/63764) loss:3.661 lr:0.0000156 epoch_Time:288.0min: [2023-12-11 09:30:08,697][model3_sft.py][INFO] Epoch:[0/2](16800/63764) loss:3.857 lr:0.0000155 epoch_Time:288.0min: [2023-12-11 09:30:27,015][model3_sft.py][INFO] Epoch:[0/2](16850/63764) loss:3.948 lr:0.0000155 epoch_Time:287.0min: [2023-12-11 09:30:45,365][model3_sft.py][INFO] Epoch:[0/2](16900/63764) loss:3.766 lr:0.0000155 epoch_Time:287.0min: [2023-12-11 09:31:03,684][model3_sft.py][INFO] Epoch:[0/2](16950/63764) loss:2.947 lr:0.0000155 epoch_Time:287.0min: [2023-12-11 09:31:22,004][model3_sft.py][INFO] Epoch:[0/2](17000/63764) loss:4.554 lr:0.0000154 epoch_Time:286.0min: [2023-12-11 
09:31:40,312][model3_sft.py][INFO] Epoch:[0/2](17050/63764) loss:3.438 lr:0.0000154 epoch_Time:286.0min: [2023-12-11 09:31:58,630][model3_sft.py][INFO] Epoch:[0/2](17100/63764) loss:3.767 lr:0.0000154 epoch_Time:286.0min: [2023-12-11 09:32:16,981][model3_sft.py][INFO] Epoch:[0/2](17150/63764) loss:3.927 lr:0.0000153 epoch_Time:285.0min: [2023-12-11 09:32:35,321][model3_sft.py][INFO] Epoch:[0/2](17200/63764) loss:3.318 lr:0.0000153 epoch_Time:285.0min: [2023-12-11 09:32:53,687][model3_sft.py][INFO] Epoch:[0/2](17250/63764) loss:3.579 lr:0.0000153 epoch_Time:285.0min: [2023-12-11 09:33:12,084][model3_sft.py][INFO] Epoch:[0/2](17300/63764) loss:3.865 lr:0.0000153 epoch_Time:285.0min: [2023-12-11 09:33:30,428][model3_sft.py][INFO] Epoch:[0/2](17350/63764) loss:3.566 lr:0.0000152 epoch_Time:284.0min: [2023-12-11 09:33:48,858][model3_sft.py][INFO] Epoch:[0/2](17400/63764) loss:2.967 lr:0.0000152 epoch_Time:284.0min: [2023-12-11 09:34:07,453][model3_sft.py][INFO] Epoch:[0/2](17450/63764) loss:4.139 lr:0.0000152 epoch_Time:284.0min: [2023-12-11 09:34:25,818][model3_sft.py][INFO] Epoch:[0/2](17500/63764) loss:3.792 lr:0.0000152 epoch_Time:283.0min: [2023-12-11 09:34:44,212][model3_sft.py][INFO] Epoch:[0/2](17550/63764) loss:3.269 lr:0.0000151 epoch_Time:283.0min: [2023-12-11 09:35:02,607][model3_sft.py][INFO] Epoch:[0/2](17600/63764) loss:3.247 lr:0.0000151 epoch_Time:283.0min: [2023-12-11 09:35:20,998][model3_sft.py][INFO] Epoch:[0/2](17650/63764) loss:3.865 lr:0.0000151 epoch_Time:282.0min: [2023-12-11 09:35:39,379][model3_sft.py][INFO] Epoch:[0/2](17700/63764) loss:4.537 lr:0.0000151 epoch_Time:282.0min: [2023-12-11 09:35:57,779][model3_sft.py][INFO] Epoch:[0/2](17750/63764) loss:3.581 lr:0.0000150 epoch_Time:282.0min: [2023-12-11 09:36:16,163][model3_sft.py][INFO] Epoch:[0/2](17800/63764) loss:3.474 lr:0.0000150 epoch_Time:281.0min: [2023-12-11 09:36:34,562][model3_sft.py][INFO] Epoch:[0/2](17850/63764) loss:4.186 lr:0.0000150 epoch_Time:281.0min: [2023-12-11 
09:36:52,921][model3_sft.py][INFO] Epoch:[0/2](17900/63764) loss:3.733 lr:0.0000149 epoch_Time:281.0min: [2023-12-11 09:37:11,299][model3_sft.py][INFO] Epoch:[0/2](17950/63764) loss:3.059 lr:0.0000149 epoch_Time:281.0min: [2023-12-11 09:37:29,651][model3_sft.py][INFO] Epoch:[0/2](18000/63764) loss:3.900 lr:0.0000149 epoch_Time:280.0min: [2023-12-11 09:37:48,052][model3_sft.py][INFO] Epoch:[0/2](18050/63764) loss:3.532 lr:0.0000149 epoch_Time:280.0min: [2023-12-11 09:38:06,426][model3_sft.py][INFO] Epoch:[0/2](18100/63764) loss:3.249 lr:0.0000148 epoch_Time:280.0min: [2023-12-11 09:38:24,833][model3_sft.py][INFO] Epoch:[0/2](18150/63764) loss:3.341 lr:0.0000148 epoch_Time:279.0min: [2023-12-11 09:38:43,216][model3_sft.py][INFO] Epoch:[0/2](18200/63764) loss:3.609 lr:0.0000148 epoch_Time:279.0min: [2023-12-11 09:39:01,609][model3_sft.py][INFO] Epoch:[0/2](18250/63764) loss:3.806 lr:0.0000148 epoch_Time:279.0min: [2023-12-11 09:39:20,015][model3_sft.py][INFO] Epoch:[0/2](18300/63764) loss:4.179 lr:0.0000147 epoch_Time:278.0min: [2023-12-11 09:39:38,429][model3_sft.py][INFO] Epoch:[0/2](18350/63764) loss:3.110 lr:0.0000147 epoch_Time:278.0min: [2023-12-11 09:39:57,065][model3_sft.py][INFO] Epoch:[0/2](18400/63764) loss:3.665 lr:0.0000147 epoch_Time:278.0min: [2023-12-11 09:40:15,455][model3_sft.py][INFO] Epoch:[0/2](18450/63764) loss:3.670 lr:0.0000146 epoch_Time:277.0min: [2023-12-11 09:40:33,817][model3_sft.py][INFO] Epoch:[0/2](18500/63764) loss:4.776 lr:0.0000146 epoch_Time:277.0min: [2023-12-11 09:40:52,211][model3_sft.py][INFO] Epoch:[0/2](18550/63764) loss:4.172 lr:0.0000146 epoch_Time:277.0min: [2023-12-11 09:41:10,598][model3_sft.py][INFO] Epoch:[0/2](18600/63764) loss:3.355 lr:0.0000146 epoch_Time:277.0min: [2023-12-11 09:41:28,966][model3_sft.py][INFO] Epoch:[0/2](18650/63764) loss:3.918 lr:0.0000145 epoch_Time:276.0min: [2023-12-11 09:41:47,367][model3_sft.py][INFO] Epoch:[0/2](18700/63764) loss:3.728 lr:0.0000145 epoch_Time:276.0min: [2023-12-11 
09:42:05,779][model3_sft.py][INFO] Epoch:[0/2](18750/63764) loss:3.716 lr:0.0000145 epoch_Time:276.0min: [2023-12-11 09:42:24,170][model3_sft.py][INFO] Epoch:[0/2](18800/63764) loss:3.417 lr:0.0000145 epoch_Time:275.0min: [2023-12-11 09:42:42,568][model3_sft.py][INFO] Epoch:[0/2](18850/63764) loss:3.365 lr:0.0000144 epoch_Time:275.0min: [2023-12-11 09:43:00,960][model3_sft.py][INFO] Epoch:[0/2](18900/63764) loss:3.259 lr:0.0000144 epoch_Time:275.0min: [2023-12-11 09:43:19,344][model3_sft.py][INFO] Epoch:[0/2](18950/63764) loss:4.232 lr:0.0000144 epoch_Time:274.0min: [2023-12-11 09:43:37,714][model3_sft.py][INFO] Epoch:[0/2](19000/63764) loss:2.961 lr:0.0000143 epoch_Time:274.0min: [2023-12-11 09:43:56,085][model3_sft.py][INFO] Epoch:[0/2](19050/63764) loss:3.990 lr:0.0000143 epoch_Time:274.0min: [2023-12-11 09:44:14,482][model3_sft.py][INFO] Epoch:[0/2](19100/63764) loss:3.362 lr:0.0000143 epoch_Time:273.0min: [2023-12-11 09:44:32,863][model3_sft.py][INFO] Epoch:[0/2](19150/63764) loss:3.519 lr:0.0000143 epoch_Time:273.0min: [2023-12-11 09:44:51,253][model3_sft.py][INFO] Epoch:[0/2](19200/63764) loss:3.459 lr:0.0000142 epoch_Time:273.0min: [2023-12-11 09:45:09,651][model3_sft.py][INFO] Epoch:[0/2](19250/63764) loss:3.852 lr:0.0000142 epoch_Time:273.0min: [2023-12-11 09:45:28,049][model3_sft.py][INFO] Epoch:[0/2](19300/63764) loss:4.363 lr:0.0000142 epoch_Time:272.0min: [2023-12-11 09:45:46,671][model3_sft.py][INFO] Epoch:[0/2](19350/63764) loss:4.025 lr:0.0000141 epoch_Time:272.0min: [2023-12-11 09:46:05,090][model3_sft.py][INFO] Epoch:[0/2](19400/63764) loss:3.894 lr:0.0000141 epoch_Time:272.0min: [2023-12-11 09:46:23,468][model3_sft.py][INFO] Epoch:[0/2](19450/63764) loss:3.926 lr:0.0000141 epoch_Time:271.0min: [2023-12-11 09:46:41,868][model3_sft.py][INFO] Epoch:[0/2](19500/63764) loss:4.008 lr:0.0000141 epoch_Time:271.0min: [2023-12-11 09:47:00,278][model3_sft.py][INFO] Epoch:[0/2](19550/63764) loss:3.840 lr:0.0000140 epoch_Time:271.0min: [2023-12-11 
09:47:18,653][model3_sft.py][INFO] Epoch:[0/2](19600/63764) loss:3.559 lr:0.0000140 epoch_Time:270.0min: [2023-12-11 09:47:37,047][model3_sft.py][INFO] Epoch:[0/2](19650/63764) loss:3.101 lr:0.0000140 epoch_Time:270.0min: [2023-12-11 09:47:55,447][model3_sft.py][INFO] Epoch:[0/2](19700/63764) loss:3.675 lr:0.0000140 epoch_Time:270.0min: [2023-12-11 09:48:13,824][model3_sft.py][INFO] Epoch:[0/2](19750/63764) loss:3.840 lr:0.0000139 epoch_Time:269.0min: [2023-12-11 09:48:32,241][model3_sft.py][INFO] Epoch:[0/2](19800/63764) loss:4.398 lr:0.0000139 epoch_Time:269.0min: [2023-12-11 09:48:50,634][model3_sft.py][INFO] Epoch:[0/2](19850/63764) loss:3.557 lr:0.0000139 epoch_Time:269.0min: [2023-12-11 09:49:09,015][model3_sft.py][INFO] Epoch:[0/2](19900/63764) loss:3.163 lr:0.0000138 epoch_Time:269.0min: [2023-12-11 09:49:27,397][model3_sft.py][INFO] Epoch:[0/2](19950/63764) loss:3.768 lr:0.0000138 epoch_Time:268.0min: [2023-12-11 09:49:45,776][model3_sft.py][INFO] Epoch:[0/2](20000/63764) loss:3.463 lr:0.0000138 epoch_Time:268.0min: [2023-12-11 09:50:04,173][model3_sft.py][INFO] Epoch:[0/2](20050/63764) loss:3.427 lr:0.0000138 epoch_Time:268.0min: [2023-12-11 09:50:22,577][model3_sft.py][INFO] Epoch:[0/2](20100/63764) loss:3.475 lr:0.0000137 epoch_Time:267.0min: [2023-12-11 09:50:40,948][model3_sft.py][INFO] Epoch:[0/2](20150/63764) loss:3.611 lr:0.0000137 epoch_Time:267.0min: [2023-12-11 09:50:59,346][model3_sft.py][INFO] Epoch:[0/2](20200/63764) loss:3.762 lr:0.0000137 epoch_Time:267.0min: [2023-12-11 09:51:17,762][model3_sft.py][INFO] Epoch:[0/2](20250/63764) loss:4.027 lr:0.0000136 epoch_Time:266.0min: [2023-12-11 09:51:36,375][model3_sft.py][INFO] Epoch:[0/2](20300/63764) loss:3.243 lr:0.0000136 epoch_Time:266.0min: [2023-12-11 09:51:54,796][model3_sft.py][INFO] Epoch:[0/2](20350/63764) loss:3.435 lr:0.0000136 epoch_Time:266.0min: [2023-12-11 09:52:13,201][model3_sft.py][INFO] Epoch:[0/2](20400/63764) loss:3.875 lr:0.0000136 epoch_Time:266.0min: [2023-12-11 
09:52:31,569][model3_sft.py][INFO] Epoch:[0/2](20450/63764) loss:3.664 lr:0.0000135 epoch_Time:265.0min: [2023-12-11 09:52:49,987][model3_sft.py][INFO] Epoch:[0/2](20500/63764) loss:3.732 lr:0.0000135 epoch_Time:265.0min: [2023-12-11 09:53:08,376][model3_sft.py][INFO] Epoch:[0/2](20550/63764) loss:4.440 lr:0.0000135 epoch_Time:265.0min: [2023-12-11 09:53:26,781][model3_sft.py][INFO] Epoch:[0/2](20600/63764) loss:2.959 lr:0.0000134 epoch_Time:264.0min: [2023-12-11 09:53:45,174][model3_sft.py][INFO] Epoch:[0/2](20650/63764) loss:3.583 lr:0.0000134 epoch_Time:264.0min: [2023-12-11 09:54:03,580][model3_sft.py][INFO] Epoch:[0/2](20700/63764) loss:3.479 lr:0.0000134 epoch_Time:264.0min: [2023-12-11 09:54:22,008][model3_sft.py][INFO] Epoch:[0/2](20750/63764) loss:4.101 lr:0.0000133 epoch_Time:263.0min: [2023-12-11 09:54:40,471][model3_sft.py][INFO] Epoch:[0/2](20800/63764) loss:3.242 lr:0.0000133 epoch_Time:263.0min: [2023-12-11 09:54:58,947][model3_sft.py][INFO] Epoch:[0/2](20850/63764) loss:2.977 lr:0.0000133 epoch_Time:263.0min: [2023-12-11 09:55:17,367][model3_sft.py][INFO] Epoch:[0/2](20900/63764) loss:4.026 lr:0.0000133 epoch_Time:262.0min: [2023-12-11 09:55:35,838][model3_sft.py][INFO] Epoch:[0/2](20950/63764) loss:3.939 lr:0.0000132 epoch_Time:262.0min: [2023-12-11 09:55:54,258][model3_sft.py][INFO] Epoch:[0/2](21000/63764) loss:3.315 lr:0.0000132 epoch_Time:262.0min: [2023-12-11 09:56:12,675][model3_sft.py][INFO] Epoch:[0/2](21050/63764) loss:4.099 lr:0.0000132 epoch_Time:262.0min: [2023-12-11 09:56:31,091][model3_sft.py][INFO] Epoch:[0/2](21100/63764) loss:4.080 lr:0.0000131 epoch_Time:261.0min: [2023-12-11 09:56:49,519][model3_sft.py][INFO] Epoch:[0/2](21150/63764) loss:3.085 lr:0.0000131 epoch_Time:261.0min: [2023-12-11 09:57:07,937][model3_sft.py][INFO] Epoch:[0/2](21200/63764) loss:3.823 lr:0.0000131 epoch_Time:261.0min: [2023-12-11 09:57:26,552][model3_sft.py][INFO] Epoch:[0/2](21250/63764) loss:3.877 lr:0.0000131 epoch_Time:260.0min: [2023-12-11 
09:57:45,014][model3_sft.py][INFO] Epoch:[0/2](21300/63764) loss:3.745 lr:0.0000130 epoch_Time:260.0min: [2023-12-11 09:58:03,423][model3_sft.py][INFO] Epoch:[0/2](21350/63764) loss:3.355 lr:0.0000130 epoch_Time:260.0min: [2023-12-11 09:58:21,839][model3_sft.py][INFO] Epoch:[0/2](21400/63764) loss:2.958 lr:0.0000130 epoch_Time:259.0min: [2023-12-11 09:58:40,313][model3_sft.py][INFO] Epoch:[0/2](21450/63764) loss:3.288 lr:0.0000129 epoch_Time:259.0min: [2023-12-11 09:58:58,749][model3_sft.py][INFO] Epoch:[0/2](21500/63764) loss:3.772 lr:0.0000129 epoch_Time:259.0min: [2023-12-11 09:59:17,208][model3_sft.py][INFO] Epoch:[0/2](21550/63764) loss:3.147 lr:0.0000129 epoch_Time:258.0min: [2023-12-11 09:59:35,594][model3_sft.py][INFO] Epoch:[0/2](21600/63764) loss:4.078 lr:0.0000129 epoch_Time:258.0min: [2023-12-11 09:59:54,041][model3_sft.py][INFO] Epoch:[0/2](21650/63764) loss:3.721 lr:0.0000128 epoch_Time:258.0min: [2023-12-11 10:00:12,448][model3_sft.py][INFO] Epoch:[0/2](21700/63764) loss:3.066 lr:0.0000128 epoch_Time:258.0min: [2023-12-11 10:00:30,801][model3_sft.py][INFO] Epoch:[0/2](21750/63764) loss:4.053 lr:0.0000128 epoch_Time:257.0min: [2023-12-11 10:00:49,189][model3_sft.py][INFO] Epoch:[0/2](21800/63764) loss:4.309 lr:0.0000127 epoch_Time:257.0min: [2023-12-11 10:01:07,574][model3_sft.py][INFO] Epoch:[0/2](21850/63764) loss:3.295 lr:0.0000127 epoch_Time:257.0min: [2023-12-11 10:01:25,990][model3_sft.py][INFO] Epoch:[0/2](21900/63764) loss:4.118 lr:0.0000127 epoch_Time:256.0min: [2023-12-11 10:01:44,389][model3_sft.py][INFO] Epoch:[0/2](21950/63764) loss:3.469 lr:0.0000126 epoch_Time:256.0min: [2023-12-11 10:02:02,774][model3_sft.py][INFO] Epoch:[0/2](22000/63764) loss:3.859 lr:0.0000126 epoch_Time:256.0min: [2023-12-11 10:02:21,139][model3_sft.py][INFO] Epoch:[0/2](22050/63764) loss:3.746 lr:0.0000126 epoch_Time:255.0min: [2023-12-11 10:02:39,561][model3_sft.py][INFO] Epoch:[0/2](22100/63764) loss:3.961 lr:0.0000126 epoch_Time:255.0min: [2023-12-11 
10:02:57,955][model3_sft.py][INFO] Epoch:[0/2](22150/63764) loss:3.517 lr:0.0000125 epoch_Time:255.0min: [2023-12-11 10:03:16,352][model3_sft.py][INFO] Epoch:[0/2](22200/63764) loss:3.672 lr:0.0000125 epoch_Time:254.0min: [2023-12-11 10:03:34,957][model3_sft.py][INFO] Epoch:[0/2](22250/63764) loss:2.691 lr:0.0000125 epoch_Time:254.0min: [2023-12-11 10:03:53,350][model3_sft.py][INFO] Epoch:[0/2](22300/63764) loss:3.277 lr:0.0000124 epoch_Time:254.0min: [2023-12-11 10:04:11,732][model3_sft.py][INFO] Epoch:[0/2](22350/63764) loss:3.286 lr:0.0000124 epoch_Time:254.0min: [2023-12-11 10:04:30,109][model3_sft.py][INFO] Epoch:[0/2](22400/63764) loss:3.152 lr:0.0000124 epoch_Time:253.0min: [2023-12-11 10:04:48,514][model3_sft.py][INFO] Epoch:[0/2](22450/63764) loss:3.605 lr:0.0000123 epoch_Time:253.0min: [2023-12-11 10:05:06,913][model3_sft.py][INFO] Epoch:[0/2](22500/63764) loss:3.818 lr:0.0000123 epoch_Time:253.0min: [2023-12-11 10:05:25,347][model3_sft.py][INFO] Epoch:[0/2](22550/63764) loss:3.172 lr:0.0000123 epoch_Time:252.0min: [2023-12-11 10:05:43,760][model3_sft.py][INFO] Epoch:[0/2](22600/63764) loss:3.756 lr:0.0000123 epoch_Time:252.0min: [2023-12-11 10:06:02,160][model3_sft.py][INFO] Epoch:[0/2](22650/63764) loss:3.531 lr:0.0000122 epoch_Time:252.0min: [2023-12-11 10:06:20,543][model3_sft.py][INFO] Epoch:[0/2](22700/63764) loss:3.236 lr:0.0000122 epoch_Time:251.0min: [2023-12-11 10:06:38,922][model3_sft.py][INFO] Epoch:[0/2](22750/63764) loss:3.449 lr:0.0000122 epoch_Time:251.0min: [2023-12-11 10:06:57,323][model3_sft.py][INFO] Epoch:[0/2](22800/63764) loss:3.780 lr:0.0000121 epoch_Time:251.0min: [2023-12-11 10:07:15,779][model3_sft.py][INFO] Epoch:[0/2](22850/63764) loss:3.395 lr:0.0000121 epoch_Time:250.0min: [2023-12-11 10:07:34,149][model3_sft.py][INFO] Epoch:[0/2](22900/63764) loss:3.021 lr:0.0000121 epoch_Time:250.0min: [2023-12-11 10:07:52,551][model3_sft.py][INFO] Epoch:[0/2](22950/63764) loss:3.968 lr:0.0000120 epoch_Time:250.0min: [2023-12-11 
10:08:10,926][model3_sft.py][INFO] Epoch:[0/2](23000/63764) loss:2.963 lr:0.0000120 epoch_Time:250.0min: [2023-12-11 10:08:29,338][model3_sft.py][INFO] Epoch:[0/2](23050/63764) loss:4.049 lr:0.0000120 epoch_Time:249.0min: [2023-12-11 10:08:47,726][model3_sft.py][INFO] Epoch:[0/2](23100/63764) loss:3.129 lr:0.0000120 epoch_Time:249.0min: [2023-12-11 10:09:06,153][model3_sft.py][INFO] Epoch:[0/2](23150/63764) loss:2.916 lr:0.0000119 epoch_Time:249.0min: [2023-12-11 10:09:24,757][model3_sft.py][INFO] Epoch:[0/2](23200/63764) loss:3.365 lr:0.0000119 epoch_Time:248.0min: [2023-12-11 10:09:43,201][model3_sft.py][INFO] Epoch:[0/2](23250/63764) loss:2.938 lr:0.0000119 epoch_Time:248.0min: [2023-12-11 10:10:01,619][model3_sft.py][INFO] Epoch:[0/2](23300/63764) loss:4.438 lr:0.0000118 epoch_Time:248.0min: [2023-12-11 10:10:20,041][model3_sft.py][INFO] Epoch:[0/2](23350/63764) loss:3.250 lr:0.0000118 epoch_Time:247.0min: [2023-12-11 10:10:38,453][model3_sft.py][INFO] Epoch:[0/2](23400/63764) loss:4.155 lr:0.0000118 epoch_Time:247.0min: [2023-12-11 10:10:56,854][model3_sft.py][INFO] Epoch:[0/2](23450/63764) loss:3.202 lr:0.0000117 epoch_Time:247.0min: [2023-12-11 10:11:15,236][model3_sft.py][INFO] Epoch:[0/2](23500/63764) loss:3.315 lr:0.0000117 epoch_Time:246.0min: [2023-12-11 10:11:33,643][model3_sft.py][INFO] Epoch:[0/2](23550/63764) loss:3.089 lr:0.0000117 epoch_Time:246.0min: [2023-12-11 10:11:52,095][model3_sft.py][INFO] Epoch:[0/2](23600/63764) loss:4.994 lr:0.0000117 epoch_Time:246.0min: [2023-12-11 10:12:10,510][model3_sft.py][INFO] Epoch:[0/2](23650/63764) loss:3.139 lr:0.0000116 epoch_Time:246.0min: [2023-12-11 10:12:28,932][model3_sft.py][INFO] Epoch:[0/2](23700/63764) loss:3.366 lr:0.0000116 epoch_Time:245.0min: [2023-12-11 10:12:47,347][model3_sft.py][INFO] Epoch:[0/2](23750/63764) loss:3.146 lr:0.0000116 epoch_Time:245.0min: [2023-12-11 10:13:05,786][model3_sft.py][INFO] Epoch:[0/2](23800/63764) loss:3.578 lr:0.0000115 epoch_Time:245.0min: [2023-12-11 
10:13:24,209][model3_sft.py][INFO] Epoch:[0/2](23850/63764) loss:3.540 lr:0.0000115 epoch_Time:244.0min: [2023-12-11 10:13:42,626][model3_sft.py][INFO] Epoch:[0/2](23900/63764) loss:3.528 lr:0.0000115 epoch_Time:244.0min: [2023-12-11 10:14:01,008][model3_sft.py][INFO] Epoch:[0/2](23950/63764) loss:3.582 lr:0.0000114 epoch_Time:244.0min: [2023-12-11 10:14:19,446][model3_sft.py][INFO] Epoch:[0/2](24000/63764) loss:3.142 lr:0.0000114 epoch_Time:243.0min: [2023-12-11 10:14:37,796][model3_sft.py][INFO] Epoch:[0/2](24050/63764) loss:3.384 lr:0.0000114 epoch_Time:243.0min: [2023-12-11 10:14:56,156][model3_sft.py][INFO] Epoch:[0/2](24100/63764) loss:2.826 lr:0.0000114 epoch_Time:243.0min: [2023-12-11 10:15:14,817][model3_sft.py][INFO] Epoch:[0/2](24150/63764) loss:3.014 lr:0.0000113 epoch_Time:242.0min: [2023-12-11 10:15:33,218][model3_sft.py][INFO] Epoch:[0/2](24200/63764) loss:3.610 lr:0.0000113 epoch_Time:242.0min: [2023-12-11 10:15:51,610][model3_sft.py][INFO] Epoch:[0/2](24250/63764) loss:3.007 lr:0.0000113 epoch_Time:242.0min: [2023-12-11 10:16:10,023][model3_sft.py][INFO] Epoch:[0/2](24300/63764) loss:3.530 lr:0.0000112 epoch_Time:242.0min: [2023-12-11 10:16:28,457][model3_sft.py][INFO] Epoch:[0/2](24350/63764) loss:3.040 lr:0.0000112 epoch_Time:241.0min: [2023-12-11 10:16:46,909][model3_sft.py][INFO] Epoch:[0/2](24400/63764) loss:4.181 lr:0.0000112 epoch_Time:241.0min: [2023-12-11 10:17:05,318][model3_sft.py][INFO] Epoch:[0/2](24450/63764) loss:3.623 lr:0.0000111 epoch_Time:241.0min: [2023-12-11 10:17:23,713][model3_sft.py][INFO] Epoch:[0/2](24500/63764) loss:3.656 lr:0.0000111 epoch_Time:240.0min: [2023-12-11 10:17:42,110][model3_sft.py][INFO] Epoch:[0/2](24550/63764) loss:3.761 lr:0.0000111 epoch_Time:240.0min: [2023-12-11 10:18:00,514][model3_sft.py][INFO] Epoch:[0/2](24600/63764) loss:3.735 lr:0.0000110 epoch_Time:240.0min: [2023-12-11 10:18:18,871][model3_sft.py][INFO] Epoch:[0/2](24650/63764) loss:3.505 lr:0.0000110 epoch_Time:239.0min: [2023-12-11 
10:18:37,235][model3_sft.py][INFO] Epoch:[0/2](24700/63764) loss:2.783 lr:0.0000110 epoch_Time:239.0min: [2023-12-11 10:18:55,598][model3_sft.py][INFO] Epoch:[0/2](24750/63764) loss:3.859 lr:0.0000110 epoch_Time:239.0min: [2023-12-11 10:19:13,981][model3_sft.py][INFO] Epoch:[0/2](24800/63764) loss:3.922 lr:0.0000109 epoch_Time:238.0min: [2023-12-11 10:19:32,370][model3_sft.py][INFO] Epoch:[0/2](24850/63764) loss:3.305 lr:0.0000109 epoch_Time:238.0min: [2023-12-11 10:19:50,768][model3_sft.py][INFO] Epoch:[0/2](24900/63764) loss:3.488 lr:0.0000109 epoch_Time:238.0min: [2023-12-11 10:20:09,177][model3_sft.py][INFO] Epoch:[0/2](24950/63764) loss:3.809 lr:0.0000108 epoch_Time:238.0min: [2023-12-11 10:20:27,581][model3_sft.py][INFO] Epoch:[0/2](25000/63764) loss:3.941 lr:0.0000108 epoch_Time:237.0min: [2023-12-11 10:20:46,003][model3_sft.py][INFO] Epoch:[0/2](25050/63764) loss:3.539 lr:0.0000108 epoch_Time:237.0min: [2023-12-11 10:21:04,623][model3_sft.py][INFO] Epoch:[0/2](25100/63764) loss:4.040 lr:0.0000107 epoch_Time:237.0min: [2023-12-11 10:21:23,023][model3_sft.py][INFO] Epoch:[0/2](25150/63764) loss:3.875 lr:0.0000107 epoch_Time:236.0min: [2023-12-11 10:21:41,393][model3_sft.py][INFO] Epoch:[0/2](25200/63764) loss:3.349 lr:0.0000107 epoch_Time:236.0min: [2023-12-11 10:21:59,805][model3_sft.py][INFO] Epoch:[0/2](25250/63764) loss:4.023 lr:0.0000107 epoch_Time:236.0min: [2023-12-11 10:22:18,203][model3_sft.py][INFO] Epoch:[0/2](25300/63764) loss:3.773 lr:0.0000106 epoch_Time:235.0min: [2023-12-11 10:22:36,605][model3_sft.py][INFO] Epoch:[0/2](25350/63764) loss:3.624 lr:0.0000106 epoch_Time:235.0min: [2023-12-11 10:22:54,986][model3_sft.py][INFO] Epoch:[0/2](25400/63764) loss:2.870 lr:0.0000106 epoch_Time:235.0min: [2023-12-11 10:23:13,382][model3_sft.py][INFO] Epoch:[0/2](25450/63764) loss:3.312 lr:0.0000105 epoch_Time:234.0min: [2023-12-11 10:23:31,768][model3_sft.py][INFO] Epoch:[0/2](25500/63764) loss:3.694 lr:0.0000105 epoch_Time:234.0min: [2023-12-11 
10:23:50,156][model3_sft.py][INFO] Epoch:[0/2](25550/63764) loss:3.202 lr:0.0000105 epoch_Time:234.0min: [2023-12-11 10:24:08,543][model3_sft.py][INFO] Epoch:[0/2](25600/63764) loss:3.400 lr:0.0000104 epoch_Time:234.0min: [2023-12-11 10:24:26,964][model3_sft.py][INFO] Epoch:[0/2](25650/63764) loss:3.677 lr:0.0000104 epoch_Time:233.0min: [2023-12-11 10:24:45,388][model3_sft.py][INFO] Epoch:[0/2](25700/63764) loss:2.731 lr:0.0000104 epoch_Time:233.0min: [2023-12-11 10:25:03,820][model3_sft.py][INFO] Epoch:[0/2](25750/63764) loss:3.569 lr:0.0000103 epoch_Time:233.0min: [2023-12-11 10:25:22,244][model3_sft.py][INFO] Epoch:[0/2](25800/63764) loss:3.522 lr:0.0000103 epoch_Time:232.0min: [2023-12-11 10:25:40,717][model3_sft.py][INFO] Epoch:[0/2](25850/63764) loss:3.360 lr:0.0000103 epoch_Time:232.0min: [2023-12-11 10:25:59,173][model3_sft.py][INFO] Epoch:[0/2](25900/63764) loss:3.714 lr:0.0000103 epoch_Time:232.0min: [2023-12-11 10:26:17,591][model3_sft.py][INFO] Epoch:[0/2](25950/63764) loss:3.787 lr:0.0000102 epoch_Time:231.0min: [2023-12-11 10:26:36,025][model3_sft.py][INFO] Epoch:[0/2](26000/63764) loss:3.390 lr:0.0000102 epoch_Time:231.0min: [2023-12-11 10:26:54,744][model3_sft.py][INFO] Epoch:[0/2](26050/63764) loss:3.052 lr:0.0000102 epoch_Time:231.0min: [2023-12-11 10:27:13,198][model3_sft.py][INFO] Epoch:[0/2](26100/63764) loss:3.579 lr:0.0000101 epoch_Time:231.0min: [2023-12-11 10:27:31,618][model3_sft.py][INFO] Epoch:[0/2](26150/63764) loss:3.200 lr:0.0000101 epoch_Time:230.0min: [2023-12-11 10:27:50,084][model3_sft.py][INFO] Epoch:[0/2](26200/63764) loss:3.353 lr:0.0000101 epoch_Time:230.0min: [2023-12-11 10:28:08,536][model3_sft.py][INFO] Epoch:[0/2](26250/63764) loss:3.392 lr:0.0000100 epoch_Time:230.0min: [2023-12-11 10:28:26,950][model3_sft.py][INFO] Epoch:[0/2](26300/63764) loss:4.749 lr:0.0000100 epoch_Time:229.0min: [2023-12-11 10:28:45,358][model3_sft.py][INFO] Epoch:[0/2](26350/63764) loss:3.628 lr:0.0000100 epoch_Time:229.0min: [2023-12-11 
10:29:03,777][model3_sft.py][INFO] Epoch:[0/2](26400/63764) loss:3.279 lr:0.0000100 epoch_Time:229.0min: [2023-12-11 10:29:22,205][model3_sft.py][INFO] Epoch:[0/2](26450/63764) loss:3.349 lr:0.0000099 epoch_Time:228.0min: [2023-12-11 10:29:40,652][model3_sft.py][INFO] Epoch:[0/2](26500/63764) loss:3.662 lr:0.0000099 epoch_Time:228.0min: [2023-12-11 10:29:59,063][model3_sft.py][INFO] Epoch:[0/2](26550/63764) loss:3.731 lr:0.0000099 epoch_Time:228.0min: [2023-12-11 10:30:17,580][model3_sft.py][INFO] Epoch:[0/2](26600/63764) loss:3.062 lr:0.0000098 epoch_Time:227.0min: [2023-12-11 10:30:36,015][model3_sft.py][INFO] Epoch:[0/2](26650/63764) loss:3.570 lr:0.0000098 epoch_Time:227.0min: [2023-12-11 10:30:54,421][model3_sft.py][INFO] Epoch:[0/2](26700/63764) loss:2.902 lr:0.0000098 epoch_Time:227.0min: [2023-12-11 10:31:12,851][model3_sft.py][INFO] Epoch:[0/2](26750/63764) loss:3.958 lr:0.0000097 epoch_Time:227.0min: [2023-12-11 10:31:31,269][model3_sft.py][INFO] Epoch:[0/2](26800/63764) loss:3.457 lr:0.0000097 epoch_Time:226.0min: [2023-12-11 10:31:49,667][model3_sft.py][INFO] Epoch:[0/2](26850/63764) loss:3.419 lr:0.0000097 epoch_Time:226.0min: [2023-12-11 10:32:08,099][model3_sft.py][INFO] Epoch:[0/2](26900/63764) loss:2.952 lr:0.0000096 epoch_Time:226.0min: [2023-12-11 10:32:26,511][model3_sft.py][INFO] Epoch:[0/2](26950/63764) loss:2.943 lr:0.0000096 epoch_Time:225.0min: [2023-12-11 10:32:44,914][model3_sft.py][INFO] Epoch:[0/2](27000/63764) loss:4.342 lr:0.0000096 epoch_Time:225.0min: [2023-12-11 10:33:03,533][model3_sft.py][INFO] Epoch:[0/2](27050/63764) loss:3.064 lr:0.0000096 epoch_Time:225.0min: [2023-12-11 10:33:21,896][model3_sft.py][INFO] Epoch:[0/2](27100/63764) loss:3.358 lr:0.0000095 epoch_Time:224.0min: [2023-12-11 10:33:40,285][model3_sft.py][INFO] Epoch:[0/2](27150/63764) loss:3.146 lr:0.0000095 epoch_Time:224.0min: [2023-12-11 10:33:58,620][model3_sft.py][INFO] Epoch:[0/2](27200/63764) loss:4.116 lr:0.0000095 epoch_Time:224.0min: [2023-12-11 
10:34:16,982][model3_sft.py][INFO] Epoch:[0/2](27250/63764) loss:3.193 lr:0.0000094 epoch_Time:223.0min: [2023-12-11 10:34:35,352][model3_sft.py][INFO] Epoch:[0/2](27300/63764) loss:3.455 lr:0.0000094 epoch_Time:223.0min: [2023-12-11 10:34:53,748][model3_sft.py][INFO] Epoch:[0/2](27350/63764) loss:3.246 lr:0.0000094 epoch_Time:223.0min: [2023-12-11 10:35:12,119][model3_sft.py][INFO] Epoch:[0/2](27400/63764) loss:2.966 lr:0.0000093 epoch_Time:223.0min: [2023-12-11 10:35:30,478][model3_sft.py][INFO] Epoch:[0/2](27450/63764) loss:3.895 lr:0.0000093 epoch_Time:222.0min: [2023-12-11 10:35:48,836][model3_sft.py][INFO] Epoch:[0/2](27500/63764) loss:3.547 lr:0.0000093 epoch_Time:222.0min: [2023-12-11 10:36:07,230][model3_sft.py][INFO] Epoch:[0/2](27550/63764) loss:3.498 lr:0.0000093 epoch_Time:222.0min: [2023-12-11 10:36:25,579][model3_sft.py][INFO] Epoch:[0/2](27600/63764) loss:3.362 lr:0.0000092 epoch_Time:221.0min: [2023-12-11 10:36:43,959][model3_sft.py][INFO] Epoch:[0/2](27650/63764) loss:4.166 lr:0.0000092 epoch_Time:221.0min: [2023-12-11 10:37:02,303][model3_sft.py][INFO] Epoch:[0/2](27700/63764) loss:3.931 lr:0.0000092 epoch_Time:221.0min: [2023-12-11 10:37:20,650][model3_sft.py][INFO] Epoch:[0/2](27750/63764) loss:3.465 lr:0.0000091 epoch_Time:220.0min: [2023-12-11 10:37:39,016][model3_sft.py][INFO] Epoch:[0/2](27800/63764) loss:3.504 lr:0.0000091 epoch_Time:220.0min: [2023-12-11 10:37:57,399][model3_sft.py][INFO] Epoch:[0/2](27850/63764) loss:3.604 lr:0.0000091 epoch_Time:220.0min: [2023-12-11 10:38:15,754][model3_sft.py][INFO] Epoch:[0/2](27900/63764) loss:3.975 lr:0.0000090 epoch_Time:219.0min: [2023-12-11 10:38:34,081][model3_sft.py][INFO] Epoch:[0/2](27950/63764) loss:3.833 lr:0.0000090 epoch_Time:219.0min: [2023-12-11 10:38:52,619][model3_sft.py][INFO] Epoch:[0/2](28000/63764) loss:3.006 lr:0.0000090 epoch_Time:219.0min: [2023-12-11 10:39:10,968][model3_sft.py][INFO] Epoch:[0/2](28050/63764) loss:3.854 lr:0.0000090 epoch_Time:219.0min: [2023-12-11 
10:39:29,311][model3_sft.py][INFO] Epoch:[0/2](28100/63764) loss:3.155 lr:0.0000089 epoch_Time:218.0min: [2023-12-11 10:39:47,677][model3_sft.py][INFO] Epoch:[0/2](28150/63764) loss:4.005 lr:0.0000089 epoch_Time:218.0min: [2023-12-11 10:40:05,995][model3_sft.py][INFO] Epoch:[0/2](28200/63764) loss:3.105 lr:0.0000089 epoch_Time:218.0min: [2023-12-11 10:40:24,348][model3_sft.py][INFO] Epoch:[0/2](28250/63764) loss:3.406 lr:0.0000088 epoch_Time:217.0min: [2023-12-11 10:40:42,685][model3_sft.py][INFO] Epoch:[0/2](28300/63764) loss:3.634 lr:0.0000088 epoch_Time:217.0min: [2023-12-11 10:41:01,081][model3_sft.py][INFO] Epoch:[0/2](28350/63764) loss:3.338 lr:0.0000088 epoch_Time:217.0min: [2023-12-11 10:41:19,434][model3_sft.py][INFO] Epoch:[0/2](28400/63764) loss:3.673 lr:0.0000087 epoch_Time:216.0min: [2023-12-11 10:41:37,772][model3_sft.py][INFO] Epoch:[0/2](28450/63764) loss:3.195 lr:0.0000087 epoch_Time:216.0min: [2023-12-11 10:41:56,117][model3_sft.py][INFO] Epoch:[0/2](28500/63764) loss:3.631 lr:0.0000087 epoch_Time:216.0min: [2023-12-11 10:42:14,481][model3_sft.py][INFO] Epoch:[0/2](28550/63764) loss:3.666 lr:0.0000087 epoch_Time:215.0min: [2023-12-11 10:42:32,816][model3_sft.py][INFO] Epoch:[0/2](28600/63764) loss:3.362 lr:0.0000086 epoch_Time:215.0min: [2023-12-11 10:42:51,152][model3_sft.py][INFO] Epoch:[0/2](28650/63764) loss:3.024 lr:0.0000086 epoch_Time:215.0min: [2023-12-11 10:43:09,529][model3_sft.py][INFO] Epoch:[0/2](28700/63764) loss:3.396 lr:0.0000086 epoch_Time:215.0min: [2023-12-11 10:43:27,855][model3_sft.py][INFO] Epoch:[0/2](28750/63764) loss:3.190 lr:0.0000085 epoch_Time:214.0min: [2023-12-11 10:43:46,197][model3_sft.py][INFO] Epoch:[0/2](28800/63764) loss:2.750 lr:0.0000085 epoch_Time:214.0min: [2023-12-11 10:44:04,565][model3_sft.py][INFO] Epoch:[0/2](28850/63764) loss:2.868 lr:0.0000085 epoch_Time:214.0min: [2023-12-11 10:44:22,948][model3_sft.py][INFO] Epoch:[0/2](28900/63764) loss:2.912 lr:0.0000084 epoch_Time:213.0min: [2023-12-11 
10:44:41,537][model3_sft.py][INFO] Epoch:[0/2](28950/63764) loss:4.316 lr:0.0000084 epoch_Time:213.0min: [2023-12-11 10:44:59,884][model3_sft.py][INFO] Epoch:[0/2](29000/63764) loss:3.671 lr:0.0000084 epoch_Time:213.0min: [2023-12-11 10:45:18,263][model3_sft.py][INFO] Epoch:[0/2](29050/63764) loss:3.453 lr:0.0000084 epoch_Time:212.0min: [2023-12-11 10:45:36,651][model3_sft.py][INFO] Epoch:[0/2](29100/63764) loss:3.820 lr:0.0000083 epoch_Time:212.0min: [2023-12-11 10:45:55,038][model3_sft.py][INFO] Epoch:[0/2](29150/63764) loss:3.289 lr:0.0000083 epoch_Time:212.0min: [2023-12-11 10:46:13,387][model3_sft.py][INFO] Epoch:[0/2](29200/63764) loss:3.611 lr:0.0000083 epoch_Time:211.0min: [2023-12-11 10:46:31,840][model3_sft.py][INFO] Epoch:[0/2](29250/63764) loss:2.890 lr:0.0000082 epoch_Time:211.0min: [2023-12-11 10:46:50,237][model3_sft.py][INFO] Epoch:[0/2](29300/63764) loss:3.320 lr:0.0000082 epoch_Time:211.0min: [2023-12-11 10:47:08,612][model3_sft.py][INFO] Epoch:[0/2](29350/63764) loss:3.700 lr:0.0000082 epoch_Time:211.0min: [2023-12-11 10:47:26,961][model3_sft.py][INFO] Epoch:[0/2](29400/63764) loss:3.493 lr:0.0000081 epoch_Time:210.0min: [2023-12-11 10:47:45,364][model3_sft.py][INFO] Epoch:[0/2](29450/63764) loss:3.446 lr:0.0000081 epoch_Time:210.0min: [2023-12-11 10:48:03,728][model3_sft.py][INFO] Epoch:[0/2](29500/63764) loss:2.888 lr:0.0000081 epoch_Time:210.0min: [2023-12-11 10:48:22,092][model3_sft.py][INFO] Epoch:[0/2](29550/63764) loss:3.188 lr:0.0000081 epoch_Time:209.0min: [2023-12-11 10:48:40,442][model3_sft.py][INFO] Epoch:[0/2](29600/63764) loss:3.507 lr:0.0000080 epoch_Time:209.0min: [2023-12-11 10:48:58,814][model3_sft.py][INFO] Epoch:[0/2](29650/63764) loss:3.571 lr:0.0000080 epoch_Time:209.0min: [2023-12-11 10:49:17,190][model3_sft.py][INFO] Epoch:[0/2](29700/63764) loss:3.670 lr:0.0000080 epoch_Time:208.0min: [2023-12-11 10:49:35,549][model3_sft.py][INFO] Epoch:[0/2](29750/63764) loss:3.358 lr:0.0000079 epoch_Time:208.0min: [2023-12-11 
10:49:53,902][model3_sft.py][INFO] Epoch:[0/2](29800/63764) loss:2.869 lr:0.0000079 epoch_Time:208.0min: [2023-12-11 10:50:12,325][model3_sft.py][INFO] Epoch:[0/2](29850/63764) loss:3.188 lr:0.0000079 epoch_Time:208.0min: [2023-12-11 10:50:30,960][model3_sft.py][INFO] Epoch:[0/2](29900/63764) loss:4.017 lr:0.0000079 epoch_Time:207.0min: [2023-12-11 10:50:49,333][model3_sft.py][INFO] Epoch:[0/2](29950/63764) loss:3.294 lr:0.0000078 epoch_Time:207.0min: [2023-12-11 10:51:07,783][model3_sft.py][INFO] Epoch:[0/2](30000/63764) loss:3.911 lr:0.0000078 epoch_Time:207.0min: [2023-12-11 10:51:26,205][model3_sft.py][INFO] Epoch:[0/2](30050/63764) loss:3.814 lr:0.0000078 epoch_Time:206.0min: [2023-12-11 10:51:44,608][model3_sft.py][INFO] Epoch:[0/2](30100/63764) loss:3.522 lr:0.0000077 epoch_Time:206.0min: [2023-12-11 10:52:02,993][model3_sft.py][INFO] Epoch:[0/2](30150/63764) loss:3.564 lr:0.0000077 epoch_Time:206.0min: [2023-12-11 10:52:21,411][model3_sft.py][INFO] Epoch:[0/2](30200/63764) loss:3.097 lr:0.0000077 epoch_Time:205.0min: [2023-12-11 10:52:39,861][model3_sft.py][INFO] Epoch:[0/2](30250/63764) loss:3.227 lr:0.0000077 epoch_Time:205.0min: [2023-12-11 10:52:58,245][model3_sft.py][INFO] Epoch:[0/2](30300/63764) loss:3.347 lr:0.0000076 epoch_Time:205.0min: [2023-12-11 10:53:16,643][model3_sft.py][INFO] Epoch:[0/2](30350/63764) loss:3.345 lr:0.0000076 epoch_Time:204.0min: [2023-12-11 10:53:35,089][model3_sft.py][INFO] Epoch:[0/2](30400/63764) loss:3.159 lr:0.0000076 epoch_Time:204.0min: [2023-12-11 10:53:53,481][model3_sft.py][INFO] Epoch:[0/2](30450/63764) loss:3.928 lr:0.0000075 epoch_Time:204.0min: [2023-12-11 10:54:11,962][model3_sft.py][INFO] Epoch:[0/2](30500/63764) loss:3.547 lr:0.0000075 epoch_Time:204.0min: [2023-12-11 10:54:30,372][model3_sft.py][INFO] Epoch:[0/2](30550/63764) loss:3.278 lr:0.0000075 epoch_Time:203.0min: [2023-12-11 10:54:48,788][model3_sft.py][INFO] Epoch:[0/2](30600/63764) loss:3.264 lr:0.0000074 epoch_Time:203.0min: [2023-12-11 
10:55:07,235][model3_sft.py][INFO] Epoch:[0/2](30650/63764) loss:3.400 lr:0.0000074 epoch_Time:203.0min: [2023-12-11 10:55:25,630][model3_sft.py][INFO] Epoch:[0/2](30700/63764) loss:3.152 lr:0.0000074 epoch_Time:202.0min: [2023-12-11 10:55:44,047][model3_sft.py][INFO] Epoch:[0/2](30750/63764) loss:3.282 lr:0.0000074 epoch_Time:202.0min: [2023-12-11 10:56:02,464][model3_sft.py][INFO] Epoch:[0/2](30800/63764) loss:3.623 lr:0.0000073 epoch_Time:202.0min: [2023-12-11 10:56:21,138][model3_sft.py][INFO] Epoch:[0/2](30850/63764) loss:3.513 lr:0.0000073 epoch_Time:201.0min: [2023-12-11 10:56:39,566][model3_sft.py][INFO] Epoch:[0/2](30900/63764) loss:3.675 lr:0.0000073 epoch_Time:201.0min: [2023-12-11 10:56:57,985][model3_sft.py][INFO] Epoch:[0/2](30950/63764) loss:3.348 lr:0.0000072 epoch_Time:201.0min: [2023-12-11 10:57:16,433][model3_sft.py][INFO] Epoch:[0/2](31000/63764) loss:3.150 lr:0.0000072 epoch_Time:200.0min: [2023-12-11 10:57:34,839][model3_sft.py][INFO] Epoch:[0/2](31050/63764) loss:3.467 lr:0.0000072 epoch_Time:200.0min: [2023-12-11 10:57:53,263][model3_sft.py][INFO] Epoch:[0/2](31100/63764) loss:3.137 lr:0.0000072 epoch_Time:200.0min: [2023-12-11 10:58:11,773][model3_sft.py][INFO] Epoch:[0/2](31150/63764) loss:3.614 lr:0.0000071 epoch_Time:200.0min: [2023-12-11 10:58:30,210][model3_sft.py][INFO] Epoch:[0/2](31200/63764) loss:3.734 lr:0.0000071 epoch_Time:199.0min: [2023-12-11 10:58:48,612][model3_sft.py][INFO] Epoch:[0/2](31250/63764) loss:4.018 lr:0.0000071 epoch_Time:199.0min: [2023-12-11 10:59:07,032][model3_sft.py][INFO] Epoch:[0/2](31300/63764) loss:3.312 lr:0.0000070 epoch_Time:199.0min: [2023-12-11 10:59:25,460][model3_sft.py][INFO] Epoch:[0/2](31350/63764) loss:3.850 lr:0.0000070 epoch_Time:198.0min: [2023-12-11 10:59:43,875][model3_sft.py][INFO] Epoch:[0/2](31400/63764) loss:3.125 lr:0.0000070 epoch_Time:198.0min: [2023-12-11 11:00:02,258][model3_sft.py][INFO] Epoch:[0/2](31450/63764) loss:3.017 lr:0.0000070 epoch_Time:198.0min: [2023-12-11 
11:00:20,686][model3_sft.py][INFO] Epoch:[0/2](31500/63764) loss:3.452 lr:0.0000069 epoch_Time:197.0min: [2023-12-11 11:00:39,113][model3_sft.py][INFO] Epoch:[0/2](31550/63764) loss:3.173 lr:0.0000069 epoch_Time:197.0min: [2023-12-11 11:00:57,469][model3_sft.py][INFO] Epoch:[0/2](31600/63764) loss:4.005 lr:0.0000069 epoch_Time:197.0min: [2023-12-11 11:01:15,878][model3_sft.py][INFO] Epoch:[0/2](31650/63764) loss:3.802 lr:0.0000069 epoch_Time:196.0min: [2023-12-11 11:01:34,276][model3_sft.py][INFO] Epoch:[0/2](31700/63764) loss:2.846 lr:0.0000068 epoch_Time:196.0min: [2023-12-11 11:01:52,729][model3_sft.py][INFO] Epoch:[0/2](31750/63764) loss:3.179 lr:0.0000068 epoch_Time:196.0min: [2023-12-11 11:02:11,160][model3_sft.py][INFO] Epoch:[0/2](31800/63764) loss:3.371 lr:0.0000068 epoch_Time:196.0min: [2023-12-11 11:02:29,761][model3_sft.py][INFO] Epoch:[0/2](31850/63764) loss:3.666 lr:0.0000067 epoch_Time:195.0min: [2023-12-11 11:02:48,160][model3_sft.py][INFO] Epoch:[0/2](31900/63764) loss:3.442 lr:0.0000067 epoch_Time:195.0min: [2023-12-11 11:03:06,557][model3_sft.py][INFO] Epoch:[0/2](31950/63764) loss:3.524 lr:0.0000067 epoch_Time:195.0min: [2023-12-11 11:03:24,934][model3_sft.py][INFO] Epoch:[0/2](32000/63764) loss:3.278 lr:0.0000067 epoch_Time:194.0min: [2023-12-11 11:03:43,337][model3_sft.py][INFO] Epoch:[0/2](32050/63764) loss:2.915 lr:0.0000066 epoch_Time:194.0min: [2023-12-11 11:04:01,714][model3_sft.py][INFO] Epoch:[0/2](32100/63764) loss:4.015 lr:0.0000066 epoch_Time:194.0min: [2023-12-11 11:04:20,089][model3_sft.py][INFO] Epoch:[0/2](32150/63764) loss:3.112 lr:0.0000066 epoch_Time:193.0min: [2023-12-11 11:04:38,434][model3_sft.py][INFO] Epoch:[0/2](32200/63764) loss:3.100 lr:0.0000065 epoch_Time:193.0min: [2023-12-11 11:04:56,831][model3_sft.py][INFO] Epoch:[0/2](32250/63764) loss:3.155 lr:0.0000065 epoch_Time:193.0min: [2023-12-11 11:05:15,202][model3_sft.py][INFO] Epoch:[0/2](32300/63764) loss:3.234 lr:0.0000065 epoch_Time:192.0min: [2023-12-11 
11:05:33,566][model3_sft.py][INFO] Epoch:[0/2](32350/63764) loss:3.536 lr:0.0000065 epoch_Time:192.0min: [2023-12-11 11:05:51,923][model3_sft.py][INFO] Epoch:[0/2](32400/63764) loss:3.545 lr:0.0000064 epoch_Time:192.0min: [2023-12-11 11:06:10,302][model3_sft.py][INFO] Epoch:[0/2](32450/63764) loss:3.661 lr:0.0000064 epoch_Time:192.0min: [2023-12-11 11:06:28,662][model3_sft.py][INFO] Epoch:[0/2](32500/63764) loss:3.509 lr:0.0000064 epoch_Time:191.0min: [2023-12-11 11:06:47,008][model3_sft.py][INFO] Epoch:[0/2](32550/63764) loss:3.070 lr:0.0000064 epoch_Time:191.0min: [2023-12-11 11:07:05,360][model3_sft.py][INFO] Epoch:[0/2](32600/63764) loss:3.219 lr:0.0000063 epoch_Time:191.0min: [2023-12-11 11:07:23,762][model3_sft.py][INFO] Epoch:[0/2](32650/63764) loss:3.355 lr:0.0000063 epoch_Time:190.0min: [2023-12-11 11:07:42,173][model3_sft.py][INFO] Epoch:[0/2](32700/63764) loss:4.027 lr:0.0000063 epoch_Time:190.0min: [2023-12-11 11:08:00,563][model3_sft.py][INFO] Epoch:[0/2](32750/63764) loss:2.818 lr:0.0000062 epoch_Time:190.0min: [2023-12-11 11:08:19,177][model3_sft.py][INFO] Epoch:[0/2](32800/63764) loss:3.375 lr:0.0000062 epoch_Time:189.0min: [2023-12-11 11:08:37,551][model3_sft.py][INFO] Epoch:[0/2](32850/63764) loss:3.552 lr:0.0000062 epoch_Time:189.0min: [2023-12-11 11:08:55,919][model3_sft.py][INFO] Epoch:[0/2](32900/63764) loss:2.817 lr:0.0000062 epoch_Time:189.0min: [2023-12-11 11:09:14,270][model3_sft.py][INFO] Epoch:[0/2](32950/63764) loss:3.691 lr:0.0000061 epoch_Time:188.0min: [2023-12-11 11:09:32,634][model3_sft.py][INFO] Epoch:[0/2](33000/63764) loss:3.827 lr:0.0000061 epoch_Time:188.0min: [2023-12-11 11:09:50,962][model3_sft.py][INFO] Epoch:[0/2](33050/63764) loss:3.162 lr:0.0000061 epoch_Time:188.0min: [2023-12-11 11:10:09,358][model3_sft.py][INFO] Epoch:[0/2](33100/63764) loss:2.979 lr:0.0000061 epoch_Time:188.0min: [2023-12-11 11:10:27,710][model3_sft.py][INFO] Epoch:[0/2](33150/63764) loss:3.894 lr:0.0000060 epoch_Time:187.0min: [2023-12-11 
11:10:46,069][model3_sft.py][INFO] Epoch:[0/2](33200/63764) loss:2.953 lr:0.0000060 epoch_Time:187.0min: [2023-12-11 11:11:04,452][model3_sft.py][INFO] Epoch:[0/2](33250/63764) loss:4.379 lr:0.0000060 epoch_Time:187.0min: [2023-12-11 11:11:22,818][model3_sft.py][INFO] Epoch:[0/2](33300/63764) loss:3.740 lr:0.0000059 epoch_Time:186.0min: [2023-12-11 11:11:41,213][model3_sft.py][INFO] Epoch:[0/2](33350/63764) loss:3.273 lr:0.0000059 epoch_Time:186.0min: [2023-12-11 11:11:59,617][model3_sft.py][INFO] Epoch:[0/2](33400/63764) loss:3.521 lr:0.0000059 epoch_Time:186.0min: [2023-12-11 11:12:18,005][model3_sft.py][INFO] Epoch:[0/2](33450/63764) loss:4.109 lr:0.0000059 epoch_Time:185.0min: [2023-12-11 11:12:36,397][model3_sft.py][INFO] Epoch:[0/2](33500/63764) loss:3.680 lr:0.0000058 epoch_Time:185.0min: [2023-12-11 11:12:54,770][model3_sft.py][INFO] Epoch:[0/2](33550/63764) loss:3.755 lr:0.0000058 epoch_Time:185.0min: [2023-12-11 11:13:13,154][model3_sft.py][INFO] Epoch:[0/2](33600/63764) loss:3.434 lr:0.0000058 epoch_Time:185.0min: [2023-12-11 11:13:31,552][model3_sft.py][INFO] Epoch:[0/2](33650/63764) loss:3.300 lr:0.0000058 epoch_Time:184.0min: [2023-12-11 11:13:49,943][model3_sft.py][INFO] Epoch:[0/2](33700/63764) loss:3.602 lr:0.0000057 epoch_Time:184.0min: [2023-12-11 11:14:08,564][model3_sft.py][INFO] Epoch:[0/2](33750/63764) loss:3.729 lr:0.0000057 epoch_Time:184.0min: [2023-12-11 11:14:26,929][model3_sft.py][INFO] Epoch:[0/2](33800/63764) loss:3.820 lr:0.0000057 epoch_Time:183.0min: [2023-12-11 11:14:45,277][model3_sft.py][INFO] Epoch:[0/2](33850/63764) loss:3.299 lr:0.0000057 epoch_Time:183.0min: [2023-12-11 11:15:03,638][model3_sft.py][INFO] Epoch:[0/2](33900/63764) loss:3.087 lr:0.0000056 epoch_Time:183.0min: [2023-12-11 11:15:21,998][model3_sft.py][INFO] Epoch:[0/2](33950/63764) loss:3.522 lr:0.0000056 epoch_Time:182.0min: [2023-12-11 11:15:40,346][model3_sft.py][INFO] Epoch:[0/2](34000/63764) loss:3.752 lr:0.0000056 epoch_Time:182.0min: [2023-12-11 
11:15:58,737][model3_sft.py][INFO] Epoch:[0/2](34050/63764) loss:2.712 lr:0.0000055 epoch_Time:182.0min: [2023-12-11 11:16:17,090][model3_sft.py][INFO] Epoch:[0/2](34100/63764) loss:3.406 lr:0.0000055 epoch_Time:181.0min: [2023-12-11 11:16:35,479][model3_sft.py][INFO] Epoch:[0/2](34150/63764) loss:3.469 lr:0.0000055 epoch_Time:181.0min: [2023-12-11 11:16:53,868][model3_sft.py][INFO] Epoch:[0/2](34200/63764) loss:3.618 lr:0.0000055 epoch_Time:181.0min: [2023-12-11 11:17:12,226][model3_sft.py][INFO] Epoch:[0/2](34250/63764) loss:3.906 lr:0.0000054 epoch_Time:181.0min: [2023-12-11 11:17:30,645][model3_sft.py][INFO] Epoch:[0/2](34300/63764) loss:3.749 lr:0.0000054 epoch_Time:180.0min: [2023-12-11 11:17:49,016][model3_sft.py][INFO] Epoch:[0/2](34350/63764) loss:3.240 lr:0.0000054 epoch_Time:180.0min: [2023-12-11 11:18:07,408][model3_sft.py][INFO] Epoch:[0/2](34400/63764) loss:3.900 lr:0.0000054 epoch_Time:180.0min: [2023-12-11 11:18:25,771][model3_sft.py][INFO] Epoch:[0/2](34450/63764) loss:3.260 lr:0.0000053 epoch_Time:179.0min: [2023-12-11 11:18:44,221][model3_sft.py][INFO] Epoch:[0/2](34500/63764) loss:3.537 lr:0.0000053 epoch_Time:179.0min: [2023-12-11 11:19:02,593][model3_sft.py][INFO] Epoch:[0/2](34550/63764) loss:3.204 lr:0.0000053 epoch_Time:179.0min: [2023-12-11 11:19:20,972][model3_sft.py][INFO] Epoch:[0/2](34600/63764) loss:3.518 lr:0.0000053 epoch_Time:178.0min: [2023-12-11 11:19:39,352][model3_sft.py][INFO] Epoch:[0/2](34650/63764) loss:3.068 lr:0.0000052 epoch_Time:178.0min: [2023-12-11 11:19:57,947][model3_sft.py][INFO] Epoch:[0/2](34700/63764) loss:3.988 lr:0.0000052 epoch_Time:178.0min: [2023-12-11 11:20:16,321][model3_sft.py][INFO] Epoch:[0/2](34750/63764) loss:3.443 lr:0.0000052 epoch_Time:177.0min: [2023-12-11 11:20:34,643][model3_sft.py][INFO] Epoch:[0/2](34800/63764) loss:3.342 lr:0.0000052 epoch_Time:177.0min: [2023-12-11 11:20:53,000][model3_sft.py][INFO] Epoch:[0/2](34850/63764) loss:3.288 lr:0.0000051 epoch_Time:177.0min: [2023-12-11 
11:21:11,345][model3_sft.py][INFO] Epoch:[0/2](34900/63764) loss:2.791 lr:0.0000051 epoch_Time:177.0min: [2023-12-11 11:21:29,693][model3_sft.py][INFO] Epoch:[0/2](34950/63764) loss:3.379 lr:0.0000051 epoch_Time:176.0min: [2023-12-11 11:21:48,046][model3_sft.py][INFO] Epoch:[0/2](35000/63764) loss:3.228 lr:0.0000051 epoch_Time:176.0min: [2023-12-11 11:22:06,429][model3_sft.py][INFO] Epoch:[0/2](35050/63764) loss:2.997 lr:0.0000050 epoch_Time:176.0min: [2023-12-11 11:22:24,802][model3_sft.py][INFO] Epoch:[0/2](35100/63764) loss:3.242 lr:0.0000050 epoch_Time:175.0min: [2023-12-11 11:22:43,162][model3_sft.py][INFO] Epoch:[0/2](35150/63764) loss:3.140 lr:0.0000050 epoch_Time:175.0min: [2023-12-11 11:23:01,499][model3_sft.py][INFO] Epoch:[0/2](35200/63764) loss:3.625 lr:0.0000050 epoch_Time:175.0min: [2023-12-11 11:23:19,879][model3_sft.py][INFO] Epoch:[0/2](35250/63764) loss:3.578 lr:0.0000049 epoch_Time:174.0min: [2023-12-11 11:23:38,281][model3_sft.py][INFO] Epoch:[0/2](35300/63764) loss:3.692 lr:0.0000049 epoch_Time:174.0min: [2023-12-11 11:23:56,616][model3_sft.py][INFO] Epoch:[0/2](35350/63764) loss:3.597 lr:0.0000049 epoch_Time:174.0min: [2023-12-11 11:24:14,974][model3_sft.py][INFO] Epoch:[0/2](35400/63764) loss:3.152 lr:0.0000049 epoch_Time:173.0min: [2023-12-11 11:24:33,365][model3_sft.py][INFO] Epoch:[0/2](35450/63764) loss:3.253 lr:0.0000048 epoch_Time:173.0min: [2023-12-11 11:24:51,732][model3_sft.py][INFO] Epoch:[0/2](35500/63764) loss:3.416 lr:0.0000048 epoch_Time:173.0min: [2023-12-11 11:25:10,128][model3_sft.py][INFO] Epoch:[0/2](35550/63764) loss:3.314 lr:0.0000048 epoch_Time:173.0min: [2023-12-11 11:25:28,516][model3_sft.py][INFO] Epoch:[0/2](35600/63764) loss:3.672 lr:0.0000048 epoch_Time:172.0min: [2023-12-11 11:25:47,144][model3_sft.py][INFO] Epoch:[0/2](35650/63764) loss:3.340 lr:0.0000047 epoch_Time:172.0min: [2023-12-11 11:26:05,529][model3_sft.py][INFO] Epoch:[0/2](35700/63764) loss:3.349 lr:0.0000047 epoch_Time:172.0min: [2023-12-11 
11:26:23,900][model3_sft.py][INFO] Epoch:[0/2](35750/63764) loss:3.931 lr:0.0000047 epoch_Time:171.0min: [2023-12-11 11:26:42,289][model3_sft.py][INFO] Epoch:[0/2](35800/63764) loss:3.281 lr:0.0000047 epoch_Time:171.0min: [2023-12-11 11:27:00,675][model3_sft.py][INFO] Epoch:[0/2](35850/63764) loss:3.202 lr:0.0000046 epoch_Time:171.0min: [2023-12-11 11:27:19,066][model3_sft.py][INFO] Epoch:[0/2](35900/63764) loss:3.519 lr:0.0000046 epoch_Time:170.0min: [2023-12-11 11:27:37,415][model3_sft.py][INFO] Epoch:[0/2](35950/63764) loss:3.675 lr:0.0000046 epoch_Time:170.0min: [2023-12-11 11:27:55,781][model3_sft.py][INFO] Epoch:[0/2](36000/63764) loss:3.064 lr:0.0000046 epoch_Time:170.0min: [2023-12-11 11:28:14,178][model3_sft.py][INFO] Epoch:[0/2](36050/63764) loss:3.535 lr:0.0000046 epoch_Time:169.0min: [2023-12-11 11:28:32,560][model3_sft.py][INFO] Epoch:[0/2](36100/63764) loss:3.774 lr:0.0000045 epoch_Time:169.0min: [2023-12-11 11:28:50,953][model3_sft.py][INFO] Epoch:[0/2](36150/63764) loss:3.242 lr:0.0000045 epoch_Time:169.0min: [2023-12-11 11:29:09,385][model3_sft.py][INFO] Epoch:[0/2](36200/63764) loss:3.138 lr:0.0000045 epoch_Time:169.0min: [2023-12-11 11:29:27,848][model3_sft.py][INFO] Epoch:[0/2](36250/63764) loss:3.292 lr:0.0000045 epoch_Time:168.0min: [2023-12-11 11:29:46,264][model3_sft.py][INFO] Epoch:[0/2](36300/63764) loss:3.225 lr:0.0000044 epoch_Time:168.0min: [2023-12-11 11:30:04,739][model3_sft.py][INFO] Epoch:[0/2](36350/63764) loss:4.062 lr:0.0000044 epoch_Time:168.0min: [2023-12-11 11:30:23,169][model3_sft.py][INFO] Epoch:[0/2](36400/63764) loss:3.494 lr:0.0000044 epoch_Time:167.0min: [2023-12-11 11:30:41,599][model3_sft.py][INFO] Epoch:[0/2](36450/63764) loss:3.192 lr:0.0000044 epoch_Time:167.0min: [2023-12-11 11:31:00,015][model3_sft.py][INFO] Epoch:[0/2](36500/63764) loss:3.387 lr:0.0000043 epoch_Time:167.0min: [2023-12-11 11:31:18,480][model3_sft.py][INFO] Epoch:[0/2](36550/63764) loss:3.018 lr:0.0000043 epoch_Time:166.0min: [2023-12-11 
11:31:36,930][model3_sft.py][INFO] Epoch:[0/2](36600/63764) loss:3.932 lr:0.0000043 epoch_Time:166.0min: [2023-12-11 11:31:55,578][model3_sft.py][INFO] Epoch:[0/2](36650/63764) loss:2.709 lr:0.0000043 epoch_Time:166.0min: [2023-12-11 11:32:14,046][model3_sft.py][INFO] Epoch:[0/2](36700/63764) loss:3.396 lr:0.0000042 epoch_Time:165.0min: [2023-12-11 11:32:32,533][model3_sft.py][INFO] Epoch:[0/2](36750/63764) loss:3.411 lr:0.0000042 epoch_Time:165.0min: [2023-12-11 11:32:51,004][model3_sft.py][INFO] Epoch:[0/2](36800/63764) loss:3.443 lr:0.0000042 epoch_Time:165.0min: [2023-12-11 11:33:09,501][model3_sft.py][INFO] Epoch:[0/2](36850/63764) loss:3.422 lr:0.0000042 epoch_Time:165.0min: [2023-12-11 11:33:27,969][model3_sft.py][INFO] Epoch:[0/2](36900/63764) loss:2.856 lr:0.0000042 epoch_Time:164.0min: [2023-12-11 11:33:46,370][model3_sft.py][INFO] Epoch:[0/2](36950/63764) loss:3.387 lr:0.0000041 epoch_Time:164.0min: [2023-12-11 11:34:04,738][model3_sft.py][INFO] Epoch:[0/2](37000/63764) loss:3.577 lr:0.0000041 epoch_Time:164.0min: [2023-12-11 11:34:23,151][model3_sft.py][INFO] Epoch:[0/2](37050/63764) loss:2.671 lr:0.0000041 epoch_Time:163.0min: [2023-12-11 11:34:41,555][model3_sft.py][INFO] Epoch:[0/2](37100/63764) loss:3.356 lr:0.0000041 epoch_Time:163.0min: [2023-12-11 11:34:59,932][model3_sft.py][INFO] Epoch:[0/2](37150/63764) loss:3.991 lr:0.0000040 epoch_Time:163.0min: [2023-12-11 11:35:18,272][model3_sft.py][INFO] Epoch:[0/2](37200/63764) loss:4.286 lr:0.0000040 epoch_Time:162.0min: [2023-12-11 11:35:36,652][model3_sft.py][INFO] Epoch:[0/2](37250/63764) loss:3.590 lr:0.0000040 epoch_Time:162.0min: [2023-12-11 11:35:55,020][model3_sft.py][INFO] Epoch:[0/2](37300/63764) loss:3.049 lr:0.0000040 epoch_Time:162.0min: [2023-12-11 11:36:13,430][model3_sft.py][INFO] Epoch:[0/2](37350/63764) loss:3.136 lr:0.0000040 epoch_Time:161.0min: [2023-12-11 11:36:31,800][model3_sft.py][INFO] Epoch:[0/2](37400/63764) loss:2.978 lr:0.0000039 epoch_Time:161.0min: [2023-12-11 
11:36:50,171][model3_sft.py][INFO] Epoch:[0/2](37450/63764) loss:3.454 lr:0.0000039 epoch_Time:161.0min: [2023-12-11 11:37:08,563][model3_sft.py][INFO] Epoch:[0/2](37500/63764) loss:3.943 lr:0.0000039 epoch_Time:161.0min: [2023-12-11 11:37:26,935][model3_sft.py][INFO] Epoch:[0/2](37550/63764) loss:4.430 lr:0.0000039 epoch_Time:160.0min: [2023-12-11 11:37:45,522][model3_sft.py][INFO] Epoch:[0/2](37600/63764) loss:3.386 lr:0.0000038 epoch_Time:160.0min: [2023-12-11 11:38:03,938][model3_sft.py][INFO] Epoch:[0/2](37650/63764) loss:3.678 lr:0.0000038 epoch_Time:160.0min: [2023-12-11 11:38:22,312][model3_sft.py][INFO] Epoch:[0/2](37700/63764) loss:3.388 lr:0.0000038 epoch_Time:159.0min: [2023-12-11 11:38:40,701][model3_sft.py][INFO] Epoch:[0/2](37750/63764) loss:3.610 lr:0.0000038 epoch_Time:159.0min: [2023-12-11 11:38:59,052][model3_sft.py][INFO] Epoch:[0/2](37800/63764) loss:4.143 lr:0.0000038 epoch_Time:159.0min: [2023-12-11 11:39:17,447][model3_sft.py][INFO] Epoch:[0/2](37850/63764) loss:3.433 lr:0.0000037 epoch_Time:158.0min: [2023-12-11 11:39:35,806][model3_sft.py][INFO] Epoch:[0/2](37900/63764) loss:2.880 lr:0.0000037 epoch_Time:158.0min: [2023-12-11 11:39:54,164][model3_sft.py][INFO] Epoch:[0/2](37950/63764) loss:3.371 lr:0.0000037 epoch_Time:158.0min: [2023-12-11 11:40:12,571][model3_sft.py][INFO] Epoch:[0/2](38000/63764) loss:3.163 lr:0.0000037 epoch_Time:158.0min: [2023-12-11 11:40:30,959][model3_sft.py][INFO] Epoch:[0/2](38050/63764) loss:3.678 lr:0.0000037 epoch_Time:157.0min: [2023-12-11 11:40:49,348][model3_sft.py][INFO] Epoch:[0/2](38100/63764) loss:2.578 lr:0.0000036 epoch_Time:157.0min: [2023-12-11 11:41:07,734][model3_sft.py][INFO] Epoch:[0/2](38150/63764) loss:3.188 lr:0.0000036 epoch_Time:157.0min: [2023-12-11 11:41:26,191][model3_sft.py][INFO] Epoch:[0/2](38200/63764) loss:3.414 lr:0.0000036 epoch_Time:156.0min: [2023-12-11 11:41:44,571][model3_sft.py][INFO] Epoch:[0/2](38250/63764) loss:3.964 lr:0.0000036 epoch_Time:156.0min: [2023-12-11 
11:42:02,954][model3_sft.py][INFO] Epoch:[0/2](38300/63764) loss:2.893 lr:0.0000035 epoch_Time:156.0min: [2023-12-11 11:42:21,327][model3_sft.py][INFO] Epoch:[0/2](38350/63764) loss:3.890 lr:0.0000035 epoch_Time:155.0min: [2023-12-11 11:42:39,715][model3_sft.py][INFO] Epoch:[0/2](38400/63764) loss:3.494 lr:0.0000035 epoch_Time:155.0min: [2023-12-11 11:42:58,129][model3_sft.py][INFO] Epoch:[0/2](38450/63764) loss:3.854 lr:0.0000035 epoch_Time:155.0min: [2023-12-11 11:43:16,525][model3_sft.py][INFO] Epoch:[0/2](38500/63764) loss:2.479 lr:0.0000035 epoch_Time:154.0min: [2023-12-11 11:43:35,135][model3_sft.py][INFO] Epoch:[0/2](38550/63764) loss:3.872 lr:0.0000034 epoch_Time:154.0min: [2023-12-11 11:43:53,528][model3_sft.py][INFO] Epoch:[0/2](38600/63764) loss:3.294 lr:0.0000034 epoch_Time:154.0min: [2023-12-11 11:44:11,916][model3_sft.py][INFO] Epoch:[0/2](38650/63764) loss:3.102 lr:0.0000034 epoch_Time:154.0min: [2023-12-11 11:44:30,288][model3_sft.py][INFO] Epoch:[0/2](38700/63764) loss:3.232 lr:0.0000034 epoch_Time:153.0min: [2023-12-11 11:44:48,712][model3_sft.py][INFO] Epoch:[0/2](38750/63764) loss:2.441 lr:0.0000034 epoch_Time:153.0min: [2023-12-11 11:45:07,094][model3_sft.py][INFO] Epoch:[0/2](38800/63764) loss:3.155 lr:0.0000033 epoch_Time:153.0min: [2023-12-11 11:45:25,497][model3_sft.py][INFO] Epoch:[0/2](38850/63764) loss:3.631 lr:0.0000033 epoch_Time:152.0min: [2023-12-11 11:45:43,886][model3_sft.py][INFO] Epoch:[0/2](38900/63764) loss:2.715 lr:0.0000033 epoch_Time:152.0min: [2023-12-11 11:46:02,231][model3_sft.py][INFO] Epoch:[0/2](38950/63764) loss:3.971 lr:0.0000033 epoch_Time:152.0min: [2023-12-11 11:46:20,608][model3_sft.py][INFO] Epoch:[0/2](39000/63764) loss:3.480 lr:0.0000033 epoch_Time:151.0min: [2023-12-11 11:46:38,976][model3_sft.py][INFO] Epoch:[0/2](39050/63764) loss:2.913 lr:0.0000032 epoch_Time:151.0min: [2023-12-11 11:46:57,346][model3_sft.py][INFO] Epoch:[0/2](39100/63764) loss:3.069 lr:0.0000032 epoch_Time:151.0min: [2023-12-11 
11:47:15,717][model3_sft.py][INFO] Epoch:[0/2](39150/63764) loss:2.960 lr:0.0000032 epoch_Time:150.0min: [2023-12-11 11:47:34,085][model3_sft.py][INFO] Epoch:[0/2](39200/63764) loss:3.588 lr:0.0000032 epoch_Time:150.0min: [2023-12-11 11:47:52,469][model3_sft.py][INFO] Epoch:[0/2](39250/63764) loss:3.804 lr:0.0000032 epoch_Time:150.0min: [2023-12-11 11:48:10,814][model3_sft.py][INFO] Epoch:[0/2](39300/63764) loss:3.180 lr:0.0000031 epoch_Time:150.0min: [2023-12-11 11:48:29,188][model3_sft.py][INFO] Epoch:[0/2](39350/63764) loss:3.071 lr:0.0000031 epoch_Time:149.0min: [2023-12-11 11:48:47,591][model3_sft.py][INFO] Epoch:[0/2](39400/63764) loss:3.245 lr:0.0000031 epoch_Time:149.0min: [2023-12-11 11:49:05,958][model3_sft.py][INFO] Epoch:[0/2](39450/63764) loss:3.474 lr:0.0000031 epoch_Time:149.0min: [2023-12-11 11:49:24,550][model3_sft.py][INFO] Epoch:[0/2](39500/63764) loss:3.596 lr:0.0000031 epoch_Time:148.0min: [2023-12-11 11:49:42,931][model3_sft.py][INFO] Epoch:[0/2](39550/63764) loss:3.519 lr:0.0000031 epoch_Time:148.0min: [2023-12-11 11:50:01,335][model3_sft.py][INFO] Epoch:[0/2](39600/63764) loss:3.893 lr:0.0000030 epoch_Time:148.0min: [2023-12-11 11:50:19,726][model3_sft.py][INFO] Epoch:[0/2](39650/63764) loss:3.646 lr:0.0000030 epoch_Time:147.0min: [2023-12-11 11:50:38,067][model3_sft.py][INFO] Epoch:[0/2](39700/63764) loss:3.325 lr:0.0000030 epoch_Time:147.0min: [2023-12-11 11:50:56,444][model3_sft.py][INFO] Epoch:[0/2](39750/63764) loss:3.317 lr:0.0000030 epoch_Time:147.0min: [2023-12-11 11:51:14,841][model3_sft.py][INFO] Epoch:[0/2](39800/63764) loss:4.194 lr:0.0000030 epoch_Time:146.0min: [2023-12-11 11:51:33,204][model3_sft.py][INFO] Epoch:[0/2](39850/63764) loss:3.583 lr:0.0000029 epoch_Time:146.0min: [2023-12-11 11:51:51,559][model3_sft.py][INFO] Epoch:[0/2](39900/63764) loss:2.584 lr:0.0000029 epoch_Time:146.0min: [2023-12-11 11:52:09,952][model3_sft.py][INFO] Epoch:[0/2](39950/63764) loss:3.285 lr:0.0000029 epoch_Time:146.0min: [2023-12-11 
11:52:28,345][model3_sft.py][INFO] Epoch:[0/2](40000/63764) loss:3.404 lr:0.0000029 epoch_Time:145.0min: [2023-12-11 11:52:46,692][model3_sft.py][INFO] Epoch:[0/2](40050/63764) loss:2.828 lr:0.0000029 epoch_Time:145.0min: [2023-12-11 11:53:05,089][model3_sft.py][INFO] Epoch:[0/2](40100/63764) loss:3.210 lr:0.0000029 epoch_Time:145.0min: [2023-12-11 11:53:23,454][model3_sft.py][INFO] Epoch:[0/2](40150/63764) loss:3.663 lr:0.0000028 epoch_Time:144.0min: [2023-12-11 11:53:41,862][model3_sft.py][INFO] Epoch:[0/2](40200/63764) loss:3.024 lr:0.0000028 epoch_Time:144.0min: [2023-12-11 11:54:00,287][model3_sft.py][INFO] Epoch:[0/2](40250/63764) loss:3.532 lr:0.0000028 epoch_Time:144.0min: [2023-12-11 11:54:18,702][model3_sft.py][INFO] Epoch:[0/2](40300/63764) loss:3.091 lr:0.0000028 epoch_Time:143.0min: [2023-12-11 11:54:37,096][model3_sft.py][INFO] Epoch:[0/2](40350/63764) loss:3.522 lr:0.0000028 epoch_Time:143.0min: [2023-12-11 11:54:55,461][model3_sft.py][INFO] Epoch:[0/2](40400/63764) loss:3.675 lr:0.0000027 epoch_Time:143.0min: [2023-12-11 11:55:14,042][model3_sft.py][INFO] Epoch:[0/2](40450/63764) loss:3.940 lr:0.0000027 epoch_Time:142.0min: [2023-12-11 11:55:32,448][model3_sft.py][INFO] Epoch:[0/2](40500/63764) loss:2.819 lr:0.0000027 epoch_Time:142.0min: [2023-12-11 11:55:50,819][model3_sft.py][INFO] Epoch:[0/2](40550/63764) loss:3.854 lr:0.0000027 epoch_Time:142.0min: [2023-12-11 11:56:09,212][model3_sft.py][INFO] Epoch:[0/2](40600/63764) loss:3.452 lr:0.0000027 epoch_Time:142.0min: [2023-12-11 11:56:27,612][model3_sft.py][INFO] Epoch:[0/2](40650/63764) loss:3.225 lr:0.0000027 epoch_Time:141.0min: [2023-12-11 11:56:46,071][model3_sft.py][INFO] Epoch:[0/2](40700/63764) loss:2.660 lr:0.0000026 epoch_Time:141.0min: [2023-12-11 11:57:04,507][model3_sft.py][INFO] Epoch:[0/2](40750/63764) loss:3.379 lr:0.0000026 epoch_Time:141.0min: [2023-12-11 11:57:22,932][model3_sft.py][INFO] Epoch:[0/2](40800/63764) loss:2.324 lr:0.0000026 epoch_Time:140.0min: [2023-12-11 
11:57:41,308][model3_sft.py][INFO] Epoch:[0/2](40850/63764) loss:3.324 lr:0.0000026 epoch_Time:140.0min: [2023-12-11 11:57:59,765][model3_sft.py][INFO] Epoch:[0/2](40900/63764) loss:3.683 lr:0.0000026 epoch_Time:140.0min: [2023-12-11 11:58:18,251][model3_sft.py][INFO] Epoch:[0/2](40950/63764) loss:3.337 lr:0.0000026 epoch_Time:139.0min: [2023-12-11 11:58:36,690][model3_sft.py][INFO] Epoch:[0/2](41000/63764) loss:3.085 lr:0.0000025 epoch_Time:139.0min: [2023-12-11 11:58:55,157][model3_sft.py][INFO] Epoch:[0/2](41050/63764) loss:3.451 lr:0.0000025 epoch_Time:139.0min: [2023-12-11 11:59:13,573][model3_sft.py][INFO] Epoch:[0/2](41100/63764) loss:3.773 lr:0.0000025 epoch_Time:138.0min: [2023-12-11 11:59:32,017][model3_sft.py][INFO] Epoch:[0/2](41150/63764) loss:3.209 lr:0.0000025 epoch_Time:138.0min: [2023-12-11 11:59:50,452][model3_sft.py][INFO] Epoch:[0/2](41200/63764) loss:3.271 lr:0.0000025 epoch_Time:138.0min: [2023-12-11 12:00:08,931][model3_sft.py][INFO] Epoch:[0/2](41250/63764) loss:3.277 lr:0.0000025 epoch_Time:138.0min: [2023-12-11 12:00:27,361][model3_sft.py][INFO] Epoch:[0/2](41300/63764) loss:3.286 lr:0.0000024 epoch_Time:137.0min: [2023-12-11 12:00:45,826][model3_sft.py][INFO] Epoch:[0/2](41350/63764) loss:4.019 lr:0.0000024 epoch_Time:137.0min: [2023-12-11 12:01:04,293][model3_sft.py][INFO] Epoch:[0/2](41400/63764) loss:3.668 lr:0.0000024 epoch_Time:137.0min: [2023-12-11 12:01:22,963][model3_sft.py][INFO] Epoch:[0/2](41450/63764) loss:3.802 lr:0.0000024 epoch_Time:136.0min: [2023-12-11 12:01:41,395][model3_sft.py][INFO] Epoch:[0/2](41500/63764) loss:3.226 lr:0.0000024 epoch_Time:136.0min: [2023-12-11 12:01:59,863][model3_sft.py][INFO] Epoch:[0/2](41550/63764) loss:3.870 lr:0.0000024 epoch_Time:136.0min: [2023-12-11 12:02:18,302][model3_sft.py][INFO] Epoch:[0/2](41600/63764) loss:3.554 lr:0.0000023 epoch_Time:135.0min: [2023-12-11 12:02:36,762][model3_sft.py][INFO] Epoch:[0/2](41650/63764) loss:3.741 lr:0.0000023 epoch_Time:135.0min: [2023-12-11 
12:02:55,226][model3_sft.py][INFO] Epoch:[0/2](41700/63764) loss:3.185 lr:0.0000023 epoch_Time:135.0min: [2023-12-11 12:03:13,673][model3_sft.py][INFO] Epoch:[0/2](41750/63764) loss:3.373 lr:0.0000023 epoch_Time:134.0min: [2023-12-11 12:03:32,130][model3_sft.py][INFO] Epoch:[0/2](41800/63764) loss:3.094 lr:0.0000023 epoch_Time:134.0min: [2023-12-11 12:03:50,663][model3_sft.py][INFO] Epoch:[0/2](41850/63764) loss:2.550 lr:0.0000023 epoch_Time:134.0min: [2023-12-11 12:04:09,132][model3_sft.py][INFO] Epoch:[0/2](41900/63764) loss:3.600 lr:0.0000023 epoch_Time:134.0min: [2023-12-11 12:04:27,583][model3_sft.py][INFO] Epoch:[0/2](41950/63764) loss:3.237 lr:0.0000022 epoch_Time:133.0min: [2023-12-11 12:04:46,056][model3_sft.py][INFO] Epoch:[0/2](42000/63764) loss:3.741 lr:0.0000022 epoch_Time:133.0min: [2023-12-11 12:05:04,495][model3_sft.py][INFO] Epoch:[0/2](42050/63764) loss:3.438 lr:0.0000022 epoch_Time:133.0min: [2023-12-11 12:05:23,011][model3_sft.py][INFO] Epoch:[0/2](42100/63764) loss:3.031 lr:0.0000022 epoch_Time:133.0min: [2023-12-11 12:05:41,465][model3_sft.py][INFO] Epoch:[0/2](42150/63764) loss:3.457 lr:0.0000022 epoch_Time:133.0min: [2023-12-11 12:05:59,933][model3_sft.py][INFO] Epoch:[0/2](42200/63764) loss:2.960 lr:0.0000022 epoch_Time:133.0min: [2023-12-11 12:06:18,440][model3_sft.py][INFO] Epoch:[0/2](42250/63764) loss:3.761 lr:0.0000021 epoch_Time:132.0min: [2023-12-11 12:06:36,897][model3_sft.py][INFO] Epoch:[0/2](42300/63764) loss:3.242 lr:0.0000021 epoch_Time:132.0min: [2023-12-11 12:06:55,328][model3_sft.py][INFO] Epoch:[0/2](42350/63764) loss:3.772 lr:0.0000021 epoch_Time:132.0min: [2023-12-11 12:07:14,005][model3_sft.py][INFO] Epoch:[0/2](42400/63764) loss:2.746 lr:0.0000021 epoch_Time:131.0min: [2023-12-11 12:07:32,473][model3_sft.py][INFO] Epoch:[0/2](42450/63764) loss:3.550 lr:0.0000021 epoch_Time:131.0min: [2023-12-11 12:07:50,921][model3_sft.py][INFO] Epoch:[0/2](42500/63764) loss:2.786 lr:0.0000021 epoch_Time:131.0min: [2023-12-11 
12:08:09,347][model3_sft.py][INFO] Epoch:[0/2](42550/63764) loss:3.366 lr:0.0000021 epoch_Time:131.0min: [2023-12-11 12:08:27,850][model3_sft.py][INFO] Epoch:[0/2](42600/63764) loss:3.068 lr:0.0000020 epoch_Time:130.0min: [2023-12-11 12:08:46,292][model3_sft.py][INFO] Epoch:[0/2](42650/63764) loss:3.522 lr:0.0000020 epoch_Time:130.0min: [2023-12-11 12:09:04,743][model3_sft.py][INFO] Epoch:[0/2](42700/63764) loss:3.421 lr:0.0000020 epoch_Time:130.0min: [2023-12-11 12:09:23,164][model3_sft.py][INFO] Epoch:[0/2](42750/63764) loss:3.312 lr:0.0000020 epoch_Time:129.0min: [2023-12-11 12:09:41,624][model3_sft.py][INFO] Epoch:[0/2](42800/63764) loss:3.265 lr:0.0000020 epoch_Time:129.0min: [2023-12-11 12:10:00,081][model3_sft.py][INFO] Epoch:[0/2](42850/63764) loss:3.048 lr:0.0000020 epoch_Time:129.0min: [2023-12-11 12:10:18,518][model3_sft.py][INFO] Epoch:[0/2](42900/63764) loss:3.300 lr:0.0000020 epoch_Time:128.0min: [2023-12-11 12:10:36,969][model3_sft.py][INFO] Epoch:[0/2](42950/63764) loss:2.539 lr:0.0000020 epoch_Time:128.0min: [2023-12-11 12:10:55,397][model3_sft.py][INFO] Epoch:[0/2](43000/63764) loss:2.836 lr:0.0000019 epoch_Time:128.0min: [2023-12-11 12:11:13,883][model3_sft.py][INFO] Epoch:[0/2](43050/63764) loss:3.575 lr:0.0000019 epoch_Time:127.0min: [2023-12-11 12:11:32,361][model3_sft.py][INFO] Epoch:[0/2](43100/63764) loss:3.540 lr:0.0000019 epoch_Time:127.0min: [2023-12-11 12:11:50,828][model3_sft.py][INFO] Epoch:[0/2](43150/63764) loss:3.229 lr:0.0000019 epoch_Time:127.0min: [2023-12-11 12:12:09,301][model3_sft.py][INFO] Epoch:[0/2](43200/63764) loss:3.466 lr:0.0000019 epoch_Time:127.0min: [2023-12-11 12:12:27,805][model3_sft.py][INFO] Epoch:[0/2](43250/63764) loss:3.202 lr:0.0000019 epoch_Time:126.0min: [2023-12-11 12:12:46,277][model3_sft.py][INFO] Epoch:[0/2](43300/63764) loss:3.187 lr:0.0000019 epoch_Time:126.0min: [2023-12-11 12:13:04,973][model3_sft.py][INFO] Epoch:[0/2](43350/63764) loss:3.469 lr:0.0000019 epoch_Time:126.0min: [2023-12-11 
12:13:23,404][model3_sft.py][INFO] Epoch:[0/2](43400/63764) loss:3.233 lr:0.0000018 epoch_Time:125.0min: [2023-12-11 12:13:41,875][model3_sft.py][INFO] Epoch:[0/2](43450/63764) loss:2.924 lr:0.0000018 epoch_Time:125.0min: [2023-12-11 12:14:00,343][model3_sft.py][INFO] Epoch:[0/2](43500/63764) loss:3.609 lr:0.0000018 epoch_Time:125.0min: [2023-12-11 12:14:18,880][model3_sft.py][INFO] Epoch:[0/2](43550/63764) loss:3.022 lr:0.0000018 epoch_Time:124.0min: [2023-12-11 12:14:37,322][model3_sft.py][INFO] Epoch:[0/2](43600/63764) loss:3.401 lr:0.0000018 epoch_Time:124.0min: [2023-12-11 12:14:55,777][model3_sft.py][INFO] Epoch:[0/2](43650/63764) loss:3.142 lr:0.0000018 epoch_Time:124.0min: [2023-12-11 12:15:14,246][model3_sft.py][INFO] Epoch:[0/2](43700/63764) loss:2.945 lr:0.0000018 epoch_Time:123.0min: [2023-12-11 12:15:32,724][model3_sft.py][INFO] Epoch:[0/2](43750/63764) loss:3.644 lr:0.0000018 epoch_Time:123.0min: [2023-12-11 12:15:51,162][model3_sft.py][INFO] Epoch:[0/2](43800/63764) loss:3.247 lr:0.0000017 epoch_Time:123.0min: [2023-12-11 12:16:09,610][model3_sft.py][INFO] Epoch:[0/2](43850/63764) loss:3.438 lr:0.0000017 epoch_Time:123.0min: [2023-12-11 12:16:28,102][model3_sft.py][INFO] Epoch:[0/2](43900/63764) loss:3.236 lr:0.0000017 epoch_Time:122.0min: [2023-12-11 12:16:46,604][model3_sft.py][INFO] Epoch:[0/2](43950/63764) loss:3.745 lr:0.0000017 epoch_Time:122.0min: [2023-12-11 12:17:05,027][model3_sft.py][INFO] Epoch:[0/2](44000/63764) loss:3.876 lr:0.0000017 epoch_Time:122.0min: [2023-12-11 12:17:23,450][model3_sft.py][INFO] Epoch:[0/2](44050/63764) loss:3.425 lr:0.0000017 epoch_Time:121.0min: [2023-12-11 12:17:41,937][model3_sft.py][INFO] Epoch:[0/2](44100/63764) loss:3.463 lr:0.0000017 epoch_Time:121.0min: [2023-12-11 12:18:00,389][model3_sft.py][INFO] Epoch:[0/2](44150/63764) loss:3.081 lr:0.0000017 epoch_Time:121.0min: [2023-12-11 12:18:18,859][model3_sft.py][INFO] Epoch:[0/2](44200/63764) loss:3.399 lr:0.0000016 epoch_Time:120.0min: [2023-12-11 
12:18:37,295][model3_sft.py][INFO] Epoch:[0/2](44250/63764) loss:3.064 lr:0.0000016 epoch_Time:120.0min: [2023-12-11 12:18:55,921][model3_sft.py][INFO] Epoch:[0/2](44300/63764) loss:3.783 lr:0.0000016 epoch_Time:120.0min: [2023-12-11 12:19:14,296][model3_sft.py][INFO] Epoch:[0/2](44350/63764) loss:3.552 lr:0.0000016 epoch_Time:119.0min: [2023-12-11 12:19:32,643][model3_sft.py][INFO] Epoch:[0/2](44400/63764) loss:3.170 lr:0.0000016 epoch_Time:119.0min: [2023-12-11 12:19:50,996][model3_sft.py][INFO] Epoch:[0/2](44450/63764) loss:3.633 lr:0.0000016 epoch_Time:119.0min: [2023-12-11 12:20:09,405][model3_sft.py][INFO] Epoch:[0/2](44500/63764) loss:3.204 lr:0.0000016 epoch_Time:119.0min: [2023-12-11 12:20:27,788][model3_sft.py][INFO] Epoch:[0/2](44550/63764) loss:3.907 lr:0.0000016 epoch_Time:118.0min: [2023-12-11 12:20:46,249][model3_sft.py][INFO] Epoch:[0/2](44600/63764) loss:2.915 lr:0.0000016 epoch_Time:118.0min: [2023-12-11 12:21:04,667][model3_sft.py][INFO] Epoch:[0/2](44650/63764) loss:3.480 lr:0.0000016 epoch_Time:118.0min: [2023-12-11 12:21:23,048][model3_sft.py][INFO] Epoch:[0/2](44700/63764) loss:4.205 lr:0.0000015 epoch_Time:117.0min: [2023-12-11 12:21:41,433][model3_sft.py][INFO] Epoch:[0/2](44750/63764) loss:3.137 lr:0.0000015 epoch_Time:117.0min: [2023-12-11 12:21:59,816][model3_sft.py][INFO] Epoch:[0/2](44800/63764) loss:3.380 lr:0.0000015 epoch_Time:117.0min: [2023-12-11 12:22:18,195][model3_sft.py][INFO] Epoch:[0/2](44850/63764) loss:3.552 lr:0.0000015 epoch_Time:116.0min: [2023-12-11 12:22:36,619][model3_sft.py][INFO] Epoch:[0/2](44900/63764) loss:3.421 lr:0.0000015 epoch_Time:116.0min: [2023-12-11 12:22:55,000][model3_sft.py][INFO] Epoch:[0/2](44950/63764) loss:3.477 lr:0.0000015 epoch_Time:116.0min: [2023-12-11 12:23:13,370][model3_sft.py][INFO] Epoch:[0/2](45000/63764) loss:3.697 lr:0.0000015 epoch_Time:115.0min: [2023-12-11 12:23:31,748][model3_sft.py][INFO] Epoch:[0/2](45050/63764) loss:3.194 lr:0.0000015 epoch_Time:115.0min: [2023-12-11 
12:23:50,196][model3_sft.py][INFO] Epoch:[0/2](45100/63764) loss:3.879 lr:0.0000015 epoch_Time:115.0min: [2023-12-11 12:24:08,646][model3_sft.py][INFO] Epoch:[0/2](45150/63764) loss:3.543 lr:0.0000015 epoch_Time:115.0min: [2023-12-11 12:24:27,109][model3_sft.py][INFO] Epoch:[0/2](45200/63764) loss:3.442 lr:0.0000014 epoch_Time:114.0min: [2023-12-11 12:24:45,797][model3_sft.py][INFO] Epoch:[0/2](45250/63764) loss:4.126 lr:0.0000014 epoch_Time:114.0min: [2023-12-11 12:25:04,242][model3_sft.py][INFO] Epoch:[0/2](45300/63764) loss:3.777 lr:0.0000014 epoch_Time:114.0min: [2023-12-11 12:25:22,676][model3_sft.py][INFO] Epoch:[0/2](45350/63764) loss:3.187 lr:0.0000014 epoch_Time:113.0min: [2023-12-11 12:25:41,138][model3_sft.py][INFO] Epoch:[0/2](45400/63764) loss:2.966 lr:0.0000014 epoch_Time:113.0min: [2023-12-11 12:25:59,593][model3_sft.py][INFO] Epoch:[0/2](45450/63764) loss:2.731 lr:0.0000014 epoch_Time:113.0min: [2023-12-11 12:26:18,075][model3_sft.py][INFO] Epoch:[0/2](45500/63764) loss:3.382 lr:0.0000014 epoch_Time:112.0min: [2023-12-11 12:26:36,502][model3_sft.py][INFO] Epoch:[0/2](45550/63764) loss:2.883 lr:0.0000014 epoch_Time:112.0min: [2023-12-11 12:26:54,932][model3_sft.py][INFO] Epoch:[0/2](45600/63764) loss:3.403 lr:0.0000014 epoch_Time:112.0min: [2023-12-11 12:27:13,369][model3_sft.py][INFO] Epoch:[0/2](45650/63764) loss:2.973 lr:0.0000014 epoch_Time:111.0min: [2023-12-11 12:27:31,777][model3_sft.py][INFO] Epoch:[0/2](45700/63764) loss:3.585 lr:0.0000014 epoch_Time:111.0min: [2023-12-11 12:27:50,225][model3_sft.py][INFO] Epoch:[0/2](45750/63764) loss:3.159 lr:0.0000014 epoch_Time:111.0min: [2023-12-11 12:28:08,659][model3_sft.py][INFO] Epoch:[0/2](45800/63764) loss:4.038 lr:0.0000013 epoch_Time:111.0min: [2023-12-11 12:28:27,134][model3_sft.py][INFO] Epoch:[0/2](45850/63764) loss:2.993 lr:0.0000013 epoch_Time:110.0min: [2023-12-11 12:28:45,584][model3_sft.py][INFO] Epoch:[0/2](45900/63764) loss:3.102 lr:0.0000013 epoch_Time:110.0min: [2023-12-11 
12:29:04,029][model3_sft.py][INFO] Epoch:[0/2](45950/63764) loss:3.654 lr:0.0000013 epoch_Time:110.0min: [2023-12-11 12:29:22,484][model3_sft.py][INFO] Epoch:[0/2](46000/63764) loss:3.673 lr:0.0000013 epoch_Time:109.0min: [2023-12-11 12:29:41,001][model3_sft.py][INFO] Epoch:[0/2](46050/63764) loss:3.483 lr:0.0000013 epoch_Time:109.0min: [2023-12-11 12:29:59,436][model3_sft.py][INFO] Epoch:[0/2](46100/63764) loss:2.740 lr:0.0000013 epoch_Time:109.0min: [2023-12-11 12:30:17,876][model3_sft.py][INFO] Epoch:[0/2](46150/63764) loss:3.353 lr:0.0000013 epoch_Time:108.0min: [2023-12-11 12:30:36,348][model3_sft.py][INFO] Epoch:[0/2](46200/63764) loss:3.002 lr:0.0000013 epoch_Time:108.0min: [2023-12-11 12:30:55,021][model3_sft.py][INFO] Epoch:[0/2](46250/63764) loss:3.094 lr:0.0000013 epoch_Time:108.0min: [2023-12-11 12:31:13,458][model3_sft.py][INFO] Epoch:[0/2](46300/63764) loss:3.286 lr:0.0000013 epoch_Time:107.0min: [2023-12-11 12:31:31,906][model3_sft.py][INFO] Epoch:[0/2](46350/63764) loss:3.438 lr:0.0000013 epoch_Time:107.0min: [2023-12-11 12:31:50,390][model3_sft.py][INFO] Epoch:[0/2](46400/63764) loss:3.835 lr:0.0000013 epoch_Time:107.0min: [2023-12-11 12:32:08,846][model3_sft.py][INFO] Epoch:[0/2](46450/63764) loss:3.301 lr:0.0000012 epoch_Time:107.0min: [2023-12-11 12:32:27,309][model3_sft.py][INFO] Epoch:[0/2](46500/63764) loss:2.509 lr:0.0000012 epoch_Time:106.0min: [2023-12-11 12:32:45,716][model3_sft.py][INFO] Epoch:[0/2](46550/63764) loss:2.972 lr:0.0000012 epoch_Time:106.0min: [2023-12-11 12:33:04,162][model3_sft.py][INFO] Epoch:[0/2](46600/63764) loss:3.685 lr:0.0000012 epoch_Time:106.0min: [2023-12-11 12:33:22,579][model3_sft.py][INFO] Epoch:[0/2](46650/63764) loss:2.872 lr:0.0000012 epoch_Time:105.0min: [2023-12-11 12:33:40,987][model3_sft.py][INFO] Epoch:[0/2](46700/63764) loss:3.258 lr:0.0000012 epoch_Time:105.0min: [2023-12-11 12:33:59,439][model3_sft.py][INFO] Epoch:[0/2](46750/63764) loss:3.357 lr:0.0000012 epoch_Time:105.0min: [2023-12-11 
12:34:17,858][model3_sft.py][INFO] Epoch:[0/2](46800/63764) loss:3.529 lr:0.0000012 epoch_Time:104.0min: [2023-12-11 12:34:36,329][model3_sft.py][INFO] Epoch:[0/2](46850/63764) loss:3.813 lr:0.0000012 epoch_Time:104.0min: [2023-12-11 12:34:54,783][model3_sft.py][INFO] Epoch:[0/2](46900/63764) loss:2.680 lr:0.0000012 epoch_Time:104.0min: [2023-12-11 12:35:13,277][model3_sft.py][INFO] Epoch:[0/2](46950/63764) loss:4.156 lr:0.0000012 epoch_Time:103.0min: [2023-12-11 12:35:31,733][model3_sft.py][INFO] Epoch:[0/2](47000/63764) loss:3.185 lr:0.0000012 epoch_Time:103.0min: [2023-12-11 12:35:50,143][model3_sft.py][INFO] Epoch:[0/2](47050/63764) loss:3.680 lr:0.0000012 epoch_Time:103.0min: [2023-12-11 12:36:08,574][model3_sft.py][INFO] Epoch:[0/2](47100/63764) loss:3.428 lr:0.0000012 epoch_Time:103.0min: [2023-12-11 12:36:27,011][model3_sft.py][INFO] Epoch:[0/2](47150/63764) loss:2.594 lr:0.0000012 epoch_Time:102.0min: [2023-12-11 12:36:45,617][model3_sft.py][INFO] Epoch:[0/2](47200/63764) loss:3.154 lr:0.0000012 epoch_Time:102.0min: [2023-12-11 12:37:04,061][model3_sft.py][INFO] Epoch:[0/2](47250/63764) loss:2.877 lr:0.0000011 epoch_Time:102.0min: [2023-12-11 12:37:22,466][model3_sft.py][INFO] Epoch:[0/2](47300/63764) loss:3.525 lr:0.0000011 epoch_Time:101.0min: [2023-12-11 12:37:40,934][model3_sft.py][INFO] Epoch:[0/2](47350/63764) loss:3.178 lr:0.0000011 epoch_Time:101.0min: [2023-12-11 12:37:59,364][model3_sft.py][INFO] Epoch:[0/2](47400/63764) loss:3.814 lr:0.0000011 epoch_Time:101.0min: [2023-12-11 12:38:17,803][model3_sft.py][INFO] Epoch:[0/2](47450/63764) loss:3.230 lr:0.0000011 epoch_Time:100.0min: [2023-12-11 12:38:36,325][model3_sft.py][INFO] Epoch:[0/2](47500/63764) loss:3.427 lr:0.0000011 epoch_Time:100.0min: [2023-12-11 12:38:54,765][model3_sft.py][INFO] Epoch:[0/2](47550/63764) loss:3.502 lr:0.0000011 epoch_Time:100.0min: [2023-12-11 12:39:13,220][model3_sft.py][INFO] Epoch:[0/2](47600/63764) loss:3.002 lr:0.0000011 epoch_Time:100.0min: [2023-12-11 
12:39:31,640][model3_sft.py][INFO] Epoch:[0/2](47650/63764) loss:4.006 lr:0.0000011 epoch_Time:99.0min: [2023-12-11 12:39:50,089][model3_sft.py][INFO] Epoch:[0/2](47700/63764) loss:2.830 lr:0.0000011 epoch_Time:99.0min: [2023-12-11 12:40:08,578][model3_sft.py][INFO] Epoch:[0/2](47750/63764) loss:3.699 lr:0.0000011 epoch_Time:99.0min: [2023-12-11 12:40:26,986][model3_sft.py][INFO] Epoch:[0/2](47800/63764) loss:3.747 lr:0.0000011 epoch_Time:98.0min: [2023-12-11 12:40:45,440][model3_sft.py][INFO] Epoch:[0/2](47850/63764) loss:3.100 lr:0.0000011 epoch_Time:98.0min: [2023-12-11 12:41:03,869][model3_sft.py][INFO] Epoch:[0/2](47900/63764) loss:3.054 lr:0.0000011 epoch_Time:98.0min: [2023-12-11 12:41:22,313][model3_sft.py][INFO] Epoch:[0/2](47950/63764) loss:3.738 lr:0.0000011 epoch_Time:97.0min: [2023-12-11 12:41:40,770][model3_sft.py][INFO] Epoch:[0/2](48000/63764) loss:3.797 lr:0.0000011 epoch_Time:97.0min: [2023-12-11 12:41:59,201][model3_sft.py][INFO] Epoch:[0/2](48050/63764) loss:3.371 lr:0.0000011 epoch_Time:97.0min: [2023-12-11 12:42:17,681][model3_sft.py][INFO] Epoch:[0/2](48100/63764) loss:3.762 lr:0.0000011 epoch_Time:96.0min: [2023-12-11 12:42:36,325][model3_sft.py][INFO] Epoch:[0/2](48150/63764) loss:3.212 lr:0.0000011 epoch_Time:96.0min: [2023-12-11 12:42:54,750][model3_sft.py][INFO] Epoch:[0/2](48200/63764) loss:2.789 lr:0.0000011 epoch_Time:96.0min: [2023-12-11 12:43:13,209][model3_sft.py][INFO] Epoch:[0/2](48250/63764) loss:3.404 lr:0.0000011 epoch_Time:96.0min: [2023-12-11 12:43:31,745][model3_sft.py][INFO] Epoch:[0/2](48300/63764) loss:3.545 lr:0.0000011 epoch_Time:95.0min: [2023-12-11 12:43:50,207][model3_sft.py][INFO] Epoch:[0/2](48350/63764) loss:3.512 lr:0.0000011 epoch_Time:95.0min: [2023-12-11 12:44:08,646][model3_sft.py][INFO] Epoch:[0/2](48400/63764) loss:3.274 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 12:44:27,121][model3_sft.py][INFO] Epoch:[0/2](48450/63764) loss:3.629 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 
12:44:45,571][model3_sft.py][INFO] Epoch:[0/2](48500/63764) loss:3.258 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 12:45:04,051][model3_sft.py][INFO] Epoch:[0/2](48550/63764) loss:2.913 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 12:45:22,464][model3_sft.py][INFO] Epoch:[0/2](48600/63764) loss:3.829 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 12:45:40,877][model3_sft.py][INFO] Epoch:[0/2](48650/63764) loss:4.001 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 12:45:59,326][model3_sft.py][INFO] Epoch:[0/2](48700/63764) loss:4.103 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 12:46:17,777][model3_sft.py][INFO] Epoch:[0/2](48750/63764) loss:2.885 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 12:46:36,248][model3_sft.py][INFO] Epoch:[0/2](48800/63764) loss:3.443 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 12:46:54,731][model3_sft.py][INFO] Epoch:[0/2](48850/63764) loss:3.791 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 12:47:13,187][model3_sft.py][INFO] Epoch:[0/2](48900/63764) loss:3.901 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 12:47:31,588][model3_sft.py][INFO] Epoch:[0/2](48950/63764) loss:3.917 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 12:47:50,030][model3_sft.py][INFO] Epoch:[0/2](49000/63764) loss:3.209 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 12:48:08,480][model3_sft.py][INFO] Epoch:[0/2](49050/63764) loss:2.375 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 12:48:27,129][model3_sft.py][INFO] Epoch:[0/2](49100/63764) loss:3.266 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 12:48:45,601][model3_sft.py][INFO] Epoch:[0/2](49150/63764) loss:2.916 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 12:49:04,064][model3_sft.py][INFO] Epoch:[0/2](49200/63764) loss:2.657 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 12:49:22,550][model3_sft.py][INFO] Epoch:[0/2](49250/63764) loss:3.577 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 12:49:41,025][model3_sft.py][INFO] Epoch:[0/2](49300/63764) loss:3.043 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 
12:49:59,476][model3_sft.py][INFO] Epoch:[0/2](49350/63764) loss:3.071 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 12:50:17,936][model3_sft.py][INFO] Epoch:[0/2](49400/63764) loss:3.178 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 12:50:36,408][model3_sft.py][INFO] Epoch:[0/2](49450/63764) loss:3.536 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 12:50:54,938][model3_sft.py][INFO] Epoch:[0/2](49500/63764) loss:3.597 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 12:51:13,394][model3_sft.py][INFO] Epoch:[0/2](49550/63764) loss:3.296 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 12:51:31,834][model3_sft.py][INFO] Epoch:[0/2](49600/63764) loss:2.971 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 12:51:50,335][model3_sft.py][INFO] Epoch:[0/2](49650/63764) loss:3.437 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 12:52:08,789][model3_sft.py][INFO] Epoch:[0/2](49700/63764) loss:4.294 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 12:52:27,309][model3_sft.py][INFO] Epoch:[0/2](49750/63764) loss:3.106 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 12:52:45,804][model3_sft.py][INFO] Epoch:[0/2](49800/63764) loss:2.847 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 12:53:04,240][model3_sft.py][INFO] Epoch:[0/2](49850/63764) loss:3.120 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 12:53:22,689][model3_sft.py][INFO] Epoch:[0/2](49900/63764) loss:2.919 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 12:53:41,189][model3_sft.py][INFO] Epoch:[0/2](49950/63764) loss:3.556 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 12:53:59,728][model3_sft.py][INFO] Epoch:[0/2](50000/63764) loss:3.111 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 12:54:18,412][model3_sft.py][INFO] Epoch:[0/2](50050/63764) loss:2.994 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 12:54:36,892][model3_sft.py][INFO] Epoch:[0/2](50100/63764) loss:3.526 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 12:54:55,410][model3_sft.py][INFO] Epoch:[0/2](50150/63764) loss:3.046 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 
12:55:13,916][model3_sft.py][INFO] Epoch:[0/2](50200/63764) loss:3.543 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 12:55:32,390][model3_sft.py][INFO] Epoch:[0/2](50250/63764) loss:3.151 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 12:55:50,859][model3_sft.py][INFO] Epoch:[0/2](50300/63764) loss:2.644 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 12:56:09,342][model3_sft.py][INFO] Epoch:[0/2](50350/63764) loss:3.418 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 12:56:27,855][model3_sft.py][INFO] Epoch:[0/2](50400/63764) loss:3.268 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 12:56:46,370][model3_sft.py][INFO] Epoch:[0/2](50450/63764) loss:3.303 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 12:57:04,850][model3_sft.py][INFO] Epoch:[0/2](50500/63764) loss:2.905 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 12:57:23,335][model3_sft.py][INFO] Epoch:[0/2](50550/63764) loss:3.296 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 12:57:41,767][model3_sft.py][INFO] Epoch:[0/2](50600/63764) loss:3.572 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 12:58:00,252][model3_sft.py][INFO] Epoch:[0/2](50650/63764) loss:3.065 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 12:58:18,722][model3_sft.py][INFO] Epoch:[0/2](50700/63764) loss:3.420 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 12:58:37,226][model3_sft.py][INFO] Epoch:[0/2](50750/63764) loss:2.508 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 12:58:55,711][model3_sft.py][INFO] Epoch:[0/2](50800/63764) loss:3.750 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 12:59:14,165][model3_sft.py][INFO] Epoch:[0/2](50850/63764) loss:3.676 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 12:59:32,630][model3_sft.py][INFO] Epoch:[0/2](50900/63764) loss:2.730 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 12:59:51,130][model3_sft.py][INFO] Epoch:[0/2](50950/63764) loss:2.983 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 13:00:09,601][model3_sft.py][INFO] Epoch:[0/2](51000/63764) loss:3.158 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 
13:00:28,330][model3_sft.py][INFO] Epoch:[0/2](51050/63764) loss:3.882 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 13:00:46,774][model3_sft.py][INFO] Epoch:[0/2](51100/63764) loss:3.430 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 13:01:05,257][model3_sft.py][INFO] Epoch:[0/2](51150/63764) loss:3.842 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 13:01:23,710][model3_sft.py][INFO] Epoch:[0/2](51200/63764) loss:3.268 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 13:01:42,141][model3_sft.py][INFO] Epoch:[0/2](51250/63764) loss:3.112 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 13:02:00,642][model3_sft.py][INFO] Epoch:[0/2](51300/63764) loss:3.460 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 13:02:19,147][model3_sft.py][INFO] Epoch:[0/2](51350/63764) loss:3.331 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 13:02:37,600][model3_sft.py][INFO] Epoch:[0/2](51400/63764) loss:3.121 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 13:02:56,023][model3_sft.py][INFO] Epoch:[0/2](51450/63764) loss:3.217 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 13:03:14,544][model3_sft.py][INFO] Epoch:[0/2](51500/63764) loss:3.162 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 13:03:33,014][model3_sft.py][INFO] Epoch:[0/2](51550/63764) loss:3.628 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 13:03:51,466][model3_sft.py][INFO] Epoch:[0/2](51600/63764) loss:3.569 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 13:04:09,942][model3_sft.py][INFO] Epoch:[0/2](51650/63764) loss:3.147 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 13:04:28,430][model3_sft.py][INFO] Epoch:[0/2](51700/63764) loss:3.233 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 13:04:46,934][model3_sft.py][INFO] Epoch:[0/2](51750/63764) loss:3.714 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 13:05:05,473][model3_sft.py][INFO] Epoch:[0/2](51800/63764) loss:3.405 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 13:05:23,958][model3_sft.py][INFO] Epoch:[0/2](51850/63764) loss:4.060 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 
13:05:42,409][model3_sft.py][INFO] Epoch:[0/2](51900/63764) loss:3.828 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 13:06:00,947][model3_sft.py][INFO] Epoch:[0/2](51950/63764) loss:3.500 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 13:06:19,660][model3_sft.py][INFO] Epoch:[0/2](52000/63764) loss:3.700 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 13:06:38,108][model3_sft.py][INFO] Epoch:[0/2](52050/63764) loss:2.995 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 13:06:56,588][model3_sft.py][INFO] Epoch:[0/2](52100/63764) loss:3.896 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 13:07:15,100][model3_sft.py][INFO] Epoch:[0/2](52150/63764) loss:3.190 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 13:07:33,587][model3_sft.py][INFO] Epoch:[0/2](52200/63764) loss:3.522 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 13:07:52,088][model3_sft.py][INFO] Epoch:[0/2](52250/63764) loss:4.292 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 13:08:10,590][model3_sft.py][INFO] Epoch:[0/2](52300/63764) loss:3.049 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 13:08:29,025][model3_sft.py][INFO] Epoch:[0/2](52350/63764) loss:3.448 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 13:08:47,525][model3_sft.py][INFO] Epoch:[0/2](52400/63764) loss:3.402 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 13:09:05,986][model3_sft.py][INFO] Epoch:[0/2](52450/63764) loss:2.968 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 13:09:24,516][model3_sft.py][INFO] Epoch:[0/2](52500/63764) loss:3.259 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 13:09:42,989][model3_sft.py][INFO] Epoch:[0/2](52550/63764) loss:3.704 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 13:10:01,451][model3_sft.py][INFO] Epoch:[0/2](52600/63764) loss:2.816 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 13:10:19,925][model3_sft.py][INFO] Epoch:[0/2](52650/63764) loss:3.516 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 13:10:38,454][model3_sft.py][INFO] Epoch:[0/2](52700/63764) loss:3.512 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 
13:10:56,939][model3_sft.py][INFO] Epoch:[0/2](52750/63764) loss:3.454 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 13:11:15,423][model3_sft.py][INFO] Epoch:[0/2](52800/63764) loss:3.061 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 13:11:33,912][model3_sft.py][INFO] Epoch:[0/2](52850/63764) loss:3.215 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 13:11:52,389][model3_sft.py][INFO] Epoch:[0/2](52900/63764) loss:3.310 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 13:12:11,068][model3_sft.py][INFO] Epoch:[0/2](52950/63764) loss:2.856 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 13:12:29,558][model3_sft.py][INFO] Epoch:[0/2](53000/63764) loss:3.226 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 13:12:48,075][model3_sft.py][INFO] Epoch:[0/2](53050/63764) loss:3.395 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 13:13:06,580][model3_sft.py][INFO] Epoch:[0/2](53100/63764) loss:3.672 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 13:13:25,048][model3_sft.py][INFO] Epoch:[0/2](53150/63764) loss:3.463 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 13:13:43,510][model3_sft.py][INFO] Epoch:[0/2](53200/63764) loss:3.196 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 13:14:01,993][model3_sft.py][INFO] Epoch:[0/2](53250/63764) loss:3.417 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 13:14:20,489][model3_sft.py][INFO] Epoch:[0/2](53300/63764) loss:2.991 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 13:14:38,952][model3_sft.py][INFO] Epoch:[0/2](53350/63764) loss:2.743 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 13:14:57,440][model3_sft.py][INFO] Epoch:[0/2](53400/63764) loss:2.856 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 13:15:15,968][model3_sft.py][INFO] Epoch:[0/2](53450/63764) loss:3.233 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 13:15:34,476][model3_sft.py][INFO] Epoch:[0/2](53500/63764) loss:3.622 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 13:15:52,945][model3_sft.py][INFO] Epoch:[0/2](53550/63764) loss:2.955 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 
13:16:11,411][model3_sft.py][INFO] Epoch:[0/2](53600/63764) loss:3.516 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 13:16:29,903][model3_sft.py][INFO] Epoch:[0/2](53650/63764) loss:3.084 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 13:16:48,422][model3_sft.py][INFO] Epoch:[0/2](53700/63764) loss:4.001 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 13:17:06,883][model3_sft.py][INFO] Epoch:[0/2](53750/63764) loss:3.391 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 13:17:25,357][model3_sft.py][INFO] Epoch:[0/2](53800/63764) loss:3.135 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 13:17:43,823][model3_sft.py][INFO] Epoch:[0/2](53850/63764) loss:3.248 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 13:18:02,566][model3_sft.py][INFO] Epoch:[0/2](53900/63764) loss:3.124 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 13:18:21,071][model3_sft.py][INFO] Epoch:[0/2](53950/63764) loss:2.417 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 13:18:39,614][model3_sft.py][INFO] Epoch:[0/2](54000/63764) loss:2.824 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 13:18:58,095][model3_sft.py][INFO] Epoch:[0/2](54050/63764) loss:3.204 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 13:19:16,597][model3_sft.py][INFO] Epoch:[0/2](54100/63764) loss:3.447 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 13:19:35,041][model3_sft.py][INFO] Epoch:[0/2](54150/63764) loss:3.037 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 13:19:53,504][model3_sft.py][INFO] Epoch:[0/2](54200/63764) loss:3.928 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 13:20:11,955][model3_sft.py][INFO] Epoch:[0/2](54250/63764) loss:3.414 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 13:20:30,513][model3_sft.py][INFO] Epoch:[0/2](54300/63764) loss:3.287 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 13:20:48,937][model3_sft.py][INFO] Epoch:[0/2](54350/63764) loss:3.576 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 13:21:07,429][model3_sft.py][INFO] Epoch:[0/2](54400/63764) loss:3.604 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 
13:21:25,927][model3_sft.py][INFO] Epoch:[0/2](54450/63764) loss:3.217 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 13:21:44,431][model3_sft.py][INFO] Epoch:[0/2](54500/63764) loss:3.030 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 13:22:02,909][model3_sft.py][INFO] Epoch:[0/2](54550/63764) loss:3.582 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 13:22:21,384][model3_sft.py][INFO] Epoch:[0/2](54600/63764) loss:3.033 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 13:22:39,853][model3_sft.py][INFO] Epoch:[0/2](54650/63764) loss:3.933 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 13:22:58,338][model3_sft.py][INFO] Epoch:[0/2](54700/63764) loss:3.471 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 13:23:16,805][model3_sft.py][INFO] Epoch:[0/2](54750/63764) loss:3.393 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 13:23:35,320][model3_sft.py][INFO] Epoch:[0/2](54800/63764) loss:3.442 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 13:23:54,024][model3_sft.py][INFO] Epoch:[0/2](54850/63764) loss:3.124 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 13:24:12,463][model3_sft.py][INFO] Epoch:[0/2](54900/63764) loss:3.757 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 13:24:30,901][model3_sft.py][INFO] Epoch:[0/2](54950/63764) loss:2.718 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 13:24:49,381][model3_sft.py][INFO] Epoch:[0/2](55000/63764) loss:3.361 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 13:25:07,816][model3_sft.py][INFO] Epoch:[0/2](55050/63764) loss:3.243 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 13:25:26,284][model3_sft.py][INFO] Epoch:[0/2](55100/63764) loss:3.483 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 13:25:44,745][model3_sft.py][INFO] Epoch:[0/2](55150/63764) loss:3.596 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 13:26:03,349][model3_sft.py][INFO] Epoch:[0/2](55200/63764) loss:2.810 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 13:26:21,852][model3_sft.py][INFO] Epoch:[0/2](55250/63764) loss:3.380 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 
13:26:40,400][model3_sft.py][INFO] Epoch:[0/2](55300/63764) loss:2.984 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 13:26:58,864][model3_sft.py][INFO] Epoch:[0/2](55350/63764) loss:3.483 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 13:27:17,350][model3_sft.py][INFO] Epoch:[0/2](55400/63764) loss:3.483 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 13:27:35,866][model3_sft.py][INFO] Epoch:[0/2](55450/63764) loss:3.084 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 13:27:54,336][model3_sft.py][INFO] Epoch:[0/2](55500/63764) loss:2.928 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 13:28:12,850][model3_sft.py][INFO] Epoch:[0/2](55550/63764) loss:3.342 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 13:28:31,315][model3_sft.py][INFO] Epoch:[0/2](55600/63764) loss:3.372 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 13:28:49,779][model3_sft.py][INFO] Epoch:[0/2](55650/63764) loss:3.513 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 13:29:08,272][model3_sft.py][INFO] Epoch:[0/2](55700/63764) loss:3.425 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 13:29:26,734][model3_sft.py][INFO] Epoch:[0/2](55750/63764) loss:3.435 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 13:29:45,206][model3_sft.py][INFO] Epoch:[0/2](55800/63764) loss:4.055 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 13:30:03,885][model3_sft.py][INFO] Epoch:[0/2](55850/63764) loss:3.197 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 13:30:22,401][model3_sft.py][INFO] Epoch:[0/2](55900/63764) loss:3.055 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 13:30:40,898][model3_sft.py][INFO] Epoch:[0/2](55950/63764) loss:3.111 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 13:30:59,375][model3_sft.py][INFO] Epoch:[0/2](56000/63764) loss:3.289 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 13:31:17,876][model3_sft.py][INFO] Epoch:[0/2](56050/63764) loss:3.148 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 13:31:36,378][model3_sft.py][INFO] Epoch:[0/2](56100/63764) loss:3.430 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 
13:31:54,854][model3_sft.py][INFO] Epoch:[0/2](56150/63764) loss:3.196 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 13:32:13,356][model3_sft.py][INFO] Epoch:[0/2](56200/63764) loss:3.046 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 13:32:31,884][model3_sft.py][INFO] Epoch:[0/2](56250/63764) loss:3.472 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 13:32:50,358][model3_sft.py][INFO] Epoch:[0/2](56300/63764) loss:3.276 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 13:33:08,807][model3_sft.py][INFO] Epoch:[0/2](56350/63764) loss:3.041 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 13:33:27,310][model3_sft.py][INFO] Epoch:[0/2](56400/63764) loss:3.735 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 13:33:45,826][model3_sft.py][INFO] Epoch:[0/2](56450/63764) loss:3.691 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 13:34:04,402][model3_sft.py][INFO] Epoch:[0/2](56500/63764) loss:2.902 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 13:34:22,881][model3_sft.py][INFO] Epoch:[0/2](56550/63764) loss:3.551 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 13:34:41,372][model3_sft.py][INFO] Epoch:[0/2](56600/63764) loss:2.751 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 13:34:59,876][model3_sft.py][INFO] Epoch:[0/2](56650/63764) loss:2.756 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 13:35:18,423][model3_sft.py][INFO] Epoch:[0/2](56700/63764) loss:3.129 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 13:35:36,868][model3_sft.py][INFO] Epoch:[0/2](56750/63764) loss:3.735 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 13:35:55,541][model3_sft.py][INFO] Epoch:[0/2](56800/63764) loss:3.005 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 13:36:14,047][model3_sft.py][INFO] Epoch:[0/2](56850/63764) loss:3.857 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 13:36:32,516][model3_sft.py][INFO] Epoch:[0/2](56900/63764) loss:3.452 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 13:36:50,972][model3_sft.py][INFO] Epoch:[0/2](56950/63764) loss:3.136 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 
13:37:09,432][model3_sft.py][INFO] Epoch:[0/2](57000/63764) loss:2.620 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 13:37:27,954][model3_sft.py][INFO] Epoch:[0/2](57050/63764) loss:3.547 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 13:37:46,393][model3_sft.py][INFO] Epoch:[0/2](57100/63764) loss:2.944 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 13:38:04,889][model3_sft.py][INFO] Epoch:[0/2](57150/63764) loss:2.861 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 13:38:23,352][model3_sft.py][INFO] Epoch:[0/2](57200/63764) loss:3.607 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 13:38:41,797][model3_sft.py][INFO] Epoch:[0/2](57250/63764) loss:3.845 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 13:39:00,293][model3_sft.py][INFO] Epoch:[0/2](57300/63764) loss:4.048 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 13:39:18,829][model3_sft.py][INFO] Epoch:[0/2](57350/63764) loss:3.205 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 13:39:37,280][model3_sft.py][INFO] Epoch:[0/2](57400/63764) loss:2.844 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 13:39:55,763][model3_sft.py][INFO] Epoch:[0/2](57450/63764) loss:3.165 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 13:40:14,235][model3_sft.py][INFO] Epoch:[0/2](57500/63764) loss:3.173 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 13:40:32,687][model3_sft.py][INFO] Epoch:[0/2](57550/63764) loss:3.415 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 13:40:51,177][model3_sft.py][INFO] Epoch:[0/2](57600/63764) loss:3.376 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 13:41:09,667][model3_sft.py][INFO] Epoch:[0/2](57650/63764) loss:3.484 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 13:41:28,179][model3_sft.py][INFO] Epoch:[0/2](57700/63764) loss:2.982 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 13:41:46,862][model3_sft.py][INFO] Epoch:[0/2](57750/63764) loss:3.264 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 13:42:05,353][model3_sft.py][INFO] Epoch:[0/2](57800/63764) loss:3.225 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 
13:42:23,809][model3_sft.py][INFO] Epoch:[0/2](57850/63764) loss:2.955 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 13:42:42,270][model3_sft.py][INFO] Epoch:[0/2](57900/63764) loss:3.297 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 13:43:00,763][model3_sft.py][INFO] Epoch:[0/2](57950/63764) loss:2.860 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 13:43:19,213][model3_sft.py][INFO] Epoch:[0/2](58000/63764) loss:3.308 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 13:43:37,679][model3_sft.py][INFO] Epoch:[0/2](58050/63764) loss:4.467 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 13:43:56,094][model3_sft.py][INFO] Epoch:[0/2](58100/63764) loss:3.719 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 13:44:14,563][model3_sft.py][INFO] Epoch:[0/2](58150/63764) loss:3.087 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 13:44:33,018][model3_sft.py][INFO] Epoch:[0/2](58200/63764) loss:3.239 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 13:44:51,535][model3_sft.py][INFO] Epoch:[0/2](58250/63764) loss:3.821 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 13:45:09,987][model3_sft.py][INFO] Epoch:[0/2](58300/63764) loss:3.597 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 13:45:28,443][model3_sft.py][INFO] Epoch:[0/2](58350/63764) loss:3.824 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 13:45:46,982][model3_sft.py][INFO] Epoch:[0/2](58400/63764) loss:3.445 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 13:46:05,486][model3_sft.py][INFO] Epoch:[0/2](58450/63764) loss:2.627 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 13:46:23,957][model3_sft.py][INFO] Epoch:[0/2](58500/63764) loss:3.477 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 13:46:42,431][model3_sft.py][INFO] Epoch:[0/2](58550/63764) loss:3.227 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 13:47:00,891][model3_sft.py][INFO] Epoch:[0/2](58600/63764) loss:3.297 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 13:47:19,378][model3_sft.py][INFO] Epoch:[0/2](58650/63764) loss:2.883 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 
13:47:38,036][model3_sft.py][INFO] Epoch:[0/2](58700/63764) loss:3.339 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 13:47:56,559][model3_sft.py][INFO] Epoch:[0/2](58750/63764) loss:3.417 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 13:48:15,033][model3_sft.py][INFO] Epoch:[0/2](58800/63764) loss:3.869 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 13:48:33,464][model3_sft.py][INFO] Epoch:[0/2](58850/63764) loss:2.874 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 13:48:51,947][model3_sft.py][INFO] Epoch:[0/2](58900/63764) loss:3.595 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 13:49:10,414][model3_sft.py][INFO] Epoch:[0/2](58950/63764) loss:3.730 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 13:49:28,876][model3_sft.py][INFO] Epoch:[0/2](59000/63764) loss:3.759 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 13:49:47,348][model3_sft.py][INFO] Epoch:[0/2](59050/63764) loss:3.563 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 13:50:05,825][model3_sft.py][INFO] Epoch:[0/2](59100/63764) loss:2.957 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 13:50:24,286][model3_sft.py][INFO] Epoch:[0/2](59150/63764) loss:4.045 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 13:50:42,745][model3_sft.py][INFO] Epoch:[0/2](59200/63764) loss:3.387 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 13:51:01,205][model3_sft.py][INFO] Epoch:[0/2](59250/63764) loss:3.134 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 13:51:19,664][model3_sft.py][INFO] Epoch:[0/2](59300/63764) loss:3.232 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 13:51:38,115][model3_sft.py][INFO] Epoch:[0/2](59350/63764) loss:2.887 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 13:51:56,559][model3_sft.py][INFO] Epoch:[0/2](59400/63764) loss:3.228 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 13:52:15,047][model3_sft.py][INFO] Epoch:[0/2](59450/63764) loss:3.115 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 13:52:33,502][model3_sft.py][INFO] Epoch:[0/2](59500/63764) loss:3.493 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 
13:52:51,969][model3_sft.py][INFO] Epoch:[0/2](59550/63764) loss:3.109 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 13:53:10,394][model3_sft.py][INFO] Epoch:[0/2](59600/63764) loss:3.621 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 13:53:29,029][model3_sft.py][INFO] Epoch:[0/2](59650/63764) loss:3.014 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 13:53:47,486][model3_sft.py][INFO] Epoch:[0/2](59700/63764) loss:3.519 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 13:54:05,946][model3_sft.py][INFO] Epoch:[0/2](59750/63764) loss:3.894 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 13:54:24,428][model3_sft.py][INFO] Epoch:[0/2](59800/63764) loss:3.533 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 13:54:42,874][model3_sft.py][INFO] Epoch:[0/2](59850/63764) loss:2.477 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 13:55:01,305][model3_sft.py][INFO] Epoch:[0/2](59900/63764) loss:2.848 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 13:55:19,800][model3_sft.py][INFO] Epoch:[0/2](59950/63764) loss:3.185 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 13:55:38,247][model3_sft.py][INFO] Epoch:[0/2](60000/63764) loss:3.502 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 13:55:56,717][model3_sft.py][INFO] Epoch:[0/2](60050/63764) loss:3.818 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 13:56:15,197][model3_sft.py][INFO] Epoch:[0/2](60100/63764) loss:3.140 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 13:56:33,721][model3_sft.py][INFO] Epoch:[0/2](60150/63764) loss:3.193 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 13:56:52,178][model3_sft.py][INFO] Epoch:[0/2](60200/63764) loss:3.618 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 13:57:10,681][model3_sft.py][INFO] Epoch:[0/2](60250/63764) loss:3.423 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 13:57:29,203][model3_sft.py][INFO] Epoch:[0/2](60300/63764) loss:3.198 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 13:57:47,719][model3_sft.py][INFO] Epoch:[0/2](60350/63764) loss:3.280 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 
13:58:06,167][model3_sft.py][INFO] Epoch:[0/2](60400/63764) loss:3.284 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 13:58:24,640][model3_sft.py][INFO] Epoch:[0/2](60450/63764) loss:3.070 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 13:58:43,117][model3_sft.py][INFO] Epoch:[0/2](60500/63764) loss:4.047 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 13:59:01,626][model3_sft.py][INFO] Epoch:[0/2](60550/63764) loss:2.978 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 13:59:20,134][model3_sft.py][INFO] Epoch:[0/2](60600/63764) loss:2.676 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 13:59:38,785][model3_sft.py][INFO] Epoch:[0/2](60650/63764) loss:3.657 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 13:59:57,187][model3_sft.py][INFO] Epoch:[0/2](60700/63764) loss:2.665 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 14:00:15,580][model3_sft.py][INFO] Epoch:[0/2](60750/63764) loss:3.428 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 14:00:33,986][model3_sft.py][INFO] Epoch:[0/2](60800/63764) loss:3.405 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 14:00:52,381][model3_sft.py][INFO] Epoch:[0/2](60850/63764) loss:3.465 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 14:01:10,798][model3_sft.py][INFO] Epoch:[0/2](60900/63764) loss:2.284 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 14:01:29,240][model3_sft.py][INFO] Epoch:[0/2](60950/63764) loss:3.062 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 14:01:47,632][model3_sft.py][INFO] Epoch:[0/2](61000/63764) loss:3.032 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 14:02:06,053][model3_sft.py][INFO] Epoch:[0/2](61050/63764) loss:2.949 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 14:02:24,466][model3_sft.py][INFO] Epoch:[0/2](61100/63764) loss:3.341 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 14:02:42,906][model3_sft.py][INFO] Epoch:[0/2](61150/63764) loss:3.834 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 14:03:01,290][model3_sft.py][INFO] Epoch:[0/2](61200/63764) loss:2.983 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 
14:03:19,704][model3_sft.py][INFO] Epoch:[0/2](61250/63764) loss:3.232 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 14:03:38,148][model3_sft.py][INFO] Epoch:[0/2](61300/63764) loss:3.178 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 14:03:56,579][model3_sft.py][INFO] Epoch:[0/2](61350/63764) loss:2.741 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 14:04:15,019][model3_sft.py][INFO] Epoch:[0/2](61400/63764) loss:2.704 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 14:04:33,480][model3_sft.py][INFO] Epoch:[0/2](61450/63764) loss:3.150 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 14:04:51,884][model3_sft.py][INFO] Epoch:[0/2](61500/63764) loss:3.684 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 14:05:10,285][model3_sft.py][INFO] Epoch:[0/2](61550/63764) loss:3.123 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 14:05:28,930][model3_sft.py][INFO] Epoch:[0/2](61600/63764) loss:4.189 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 14:05:47,360][model3_sft.py][INFO] Epoch:[0/2](61650/63764) loss:3.530 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 14:06:05,783][model3_sft.py][INFO] Epoch:[0/2](61700/63764) loss:3.122 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 14:06:24,177][model3_sft.py][INFO] Epoch:[0/2](61750/63764) loss:3.708 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 14:06:42,572][model3_sft.py][INFO] Epoch:[0/2](61800/63764) loss:3.510 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 14:07:00,997][model3_sft.py][INFO] Epoch:[0/2](61850/63764) loss:3.257 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 14:07:19,388][model3_sft.py][INFO] Epoch:[0/2](61900/63764) loss:3.443 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 14:07:37,806][model3_sft.py][INFO] Epoch:[0/2](61950/63764) loss:3.434 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 14:07:56,188][model3_sft.py][INFO] Epoch:[0/2](62000/63764) loss:3.464 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 14:08:14,586][model3_sft.py][INFO] Epoch:[0/2](62050/63764) loss:3.591 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 
14:08:32,954][model3_sft.py][INFO] Epoch:[0/2](62100/63764) loss:3.283 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 14:08:51,341][model3_sft.py][INFO] Epoch:[0/2](62150/63764) loss:3.260 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 14:09:09,722][model3_sft.py][INFO] Epoch:[0/2](62200/63764) loss:3.072 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 14:09:28,197][model3_sft.py][INFO] Epoch:[0/2](62250/63764) loss:3.037 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 14:09:46,610][model3_sft.py][INFO] Epoch:[0/2](62300/63764) loss:3.777 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 14:10:04,981][model3_sft.py][INFO] Epoch:[0/2](62350/63764) loss:3.830 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 14:10:23,388][model3_sft.py][INFO] Epoch:[0/2](62400/63764) loss:3.531 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 14:10:41,771][model3_sft.py][INFO] Epoch:[0/2](62450/63764) loss:3.167 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 14:11:00,192][model3_sft.py][INFO] Epoch:[0/2](62500/63764) loss:2.679 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 14:11:18,820][model3_sft.py][INFO] Epoch:[0/2](62550/63764) loss:2.924 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 14:11:37,254][model3_sft.py][INFO] Epoch:[0/2](62600/63764) loss:3.770 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 14:11:55,618][model3_sft.py][INFO] Epoch:[0/2](62650/63764) loss:3.701 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 14:12:14,048][model3_sft.py][INFO] Epoch:[0/2](62700/63764) loss:3.000 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 14:12:32,397][model3_sft.py][INFO] Epoch:[0/2](62750/63764) loss:3.326 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 14:12:50,796][model3_sft.py][INFO] Epoch:[0/2](62800/63764) loss:3.765 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 14:13:09,182][model3_sft.py][INFO] Epoch:[0/2](62850/63764) loss:3.166 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 14:13:27,601][model3_sft.py][INFO] Epoch:[0/2](62900/63764) loss:2.994 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 14:13:46,017][model3_sft.py][INFO] 
Epoch:[0/2](62950/63764) loss:3.068 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 14:14:04,411][model3_sft.py][INFO] Epoch:[0/2](63000/63764) loss:2.815 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 14:14:22,796][model3_sft.py][INFO] Epoch:[0/2](63050/63764) loss:3.241 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 14:14:41,238][model3_sft.py][INFO] Epoch:[0/2](63100/63764) loss:3.562 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 14:14:59,633][model3_sft.py][INFO] Epoch:[0/2](63150/63764) loss:2.956 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 14:15:18,029][model3_sft.py][INFO] Epoch:[0/2](63200/63764) loss:2.851 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 14:15:36,430][model3_sft.py][INFO] Epoch:[0/2](63250/63764) loss:3.913 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 14:15:54,881][model3_sft.py][INFO] Epoch:[0/2](63300/63764) loss:3.778 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 14:16:13,260][model3_sft.py][INFO] Epoch:[0/2](63350/63764) loss:2.834 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 14:16:31,664][model3_sft.py][INFO] Epoch:[0/2](63400/63764) loss:3.405 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 14:16:50,076][model3_sft.py][INFO] Epoch:[0/2](63450/63764) loss:3.452 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 14:17:08,719][model3_sft.py][INFO] Epoch:[0/2](63500/63764) loss:3.837 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 14:17:27,116][model3_sft.py][INFO] Epoch:[0/2](63550/63764) loss:3.537 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 14:17:45,548][model3_sft.py][INFO] Epoch:[0/2](63600/63764) loss:2.972 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 14:18:03,968][model3_sft.py][INFO] Epoch:[0/2](63650/63764) loss:3.310 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 14:18:22,352][model3_sft.py][INFO] Epoch:[0/2](63700/63764) loss:3.301 lr:0.0000010 epoch_Time:0.0min: [2023-12-11 14:18:40,780][model3_sft.py][INFO] Epoch:[0/2](63750/63764) loss:4.019 lr:0.0000010 epoch_Time:0.0min: [2023-12-11 14:18:48,102][model3_sft.py][INFO] Epoch:[1/2](0/63764) loss:3.421 lr:0.0000010 
epoch_Time:429.0min: [2023-12-11 14:19:06,516][model3_sft.py][INFO] Epoch:[1/2](50/63764) loss:3.418 lr:0.0000010 epoch_Time:392.0min: [2023-12-11 14:19:24,902][model3_sft.py][INFO] Epoch:[1/2](100/63764) loss:2.838 lr:0.0000010 epoch_Time:391.0min: [2023-12-11 14:19:43,307][model3_sft.py][INFO] Epoch:[1/2](150/63764) loss:3.276 lr:0.0000010 epoch_Time:391.0min: [2023-12-11 14:20:01,742][model3_sft.py][INFO] Epoch:[1/2](200/63764) loss:3.295 lr:0.0000010 epoch_Time:390.0min: [2023-12-11 14:20:20,109][model3_sft.py][INFO] Epoch:[1/2](250/63764) loss:3.338 lr:0.0000010 epoch_Time:390.0min: [2023-12-11 14:20:38,487][model3_sft.py][INFO] Epoch:[1/2](300/63764) loss:3.093 lr:0.0000010 epoch_Time:390.0min: [2023-12-11 14:20:56,892][model3_sft.py][INFO] Epoch:[1/2](350/63764) loss:3.280 lr:0.0000010 epoch_Time:389.0min: [2023-12-11 14:21:15,318][model3_sft.py][INFO] Epoch:[1/2](400/63764) loss:3.360 lr:0.0000010 epoch_Time:389.0min: [2023-12-11 14:21:33,721][model3_sft.py][INFO] Epoch:[1/2](450/63764) loss:4.180 lr:0.0000010 epoch_Time:389.0min: [2023-12-11 14:21:52,133][model3_sft.py][INFO] Epoch:[1/2](500/63764) loss:3.121 lr:0.0000010 epoch_Time:388.0min: [2023-12-11 14:22:10,604][model3_sft.py][INFO] Epoch:[1/2](550/63764) loss:3.046 lr:0.0000010 epoch_Time:388.0min: [2023-12-11 14:22:29,026][model3_sft.py][INFO] Epoch:[1/2](600/63764) loss:3.009 lr:0.0000010 epoch_Time:388.0min: [2023-12-11 14:22:47,429][model3_sft.py][INFO] Epoch:[1/2](650/63764) loss:2.590 lr:0.0000010 epoch_Time:388.0min: [2023-12-11 14:23:06,020][model3_sft.py][INFO] Epoch:[1/2](700/63764) loss:3.321 lr:0.0000010 epoch_Time:387.0min: [2023-12-11 14:23:24,414][model3_sft.py][INFO] Epoch:[1/2](750/63764) loss:2.558 lr:0.0000010 epoch_Time:387.0min: [2023-12-11 14:23:42,797][model3_sft.py][INFO] Epoch:[1/2](800/63764) loss:3.010 lr:0.0000010 epoch_Time:387.0min: [2023-12-11 14:24:01,187][model3_sft.py][INFO] Epoch:[1/2](850/63764) loss:3.534 lr:0.0000010 epoch_Time:386.0min: [2023-12-11 
14:24:19,590][model3_sft.py][INFO] Epoch:[1/2](900/63764) loss:3.209 lr:0.0000010 epoch_Time:386.0min: [2023-12-11 14:24:38,003][model3_sft.py][INFO] Epoch:[1/2](950/63764) loss:2.960 lr:0.0000010 epoch_Time:386.0min: [2023-12-11 14:24:56,367][model3_sft.py][INFO] Epoch:[1/2](1000/63764) loss:3.281 lr:0.0000010 epoch_Time:385.0min: [2023-12-11 14:25:14,783][model3_sft.py][INFO] Epoch:[1/2](1050/63764) loss:3.084 lr:0.0000010 epoch_Time:385.0min: [2023-12-11 14:25:33,150][model3_sft.py][INFO] Epoch:[1/2](1100/63764) loss:3.242 lr:0.0000010 epoch_Time:385.0min: [2023-12-11 14:25:51,579][model3_sft.py][INFO] Epoch:[1/2](1150/63764) loss:2.800 lr:0.0000010 epoch_Time:384.0min: [2023-12-11 14:26:09,997][model3_sft.py][INFO] Epoch:[1/2](1200/63764) loss:3.670 lr:0.0000010 epoch_Time:384.0min: [2023-12-11 14:26:28,409][model3_sft.py][INFO] Epoch:[1/2](1250/63764) loss:2.816 lr:0.0000010 epoch_Time:384.0min: [2023-12-11 14:26:46,815][model3_sft.py][INFO] Epoch:[1/2](1300/63764) loss:2.757 lr:0.0000010 epoch_Time:384.0min: [2023-12-11 14:27:05,237][model3_sft.py][INFO] Epoch:[1/2](1350/63764) loss:2.949 lr:0.0000010 epoch_Time:383.0min: [2023-12-11 14:27:23,631][model3_sft.py][INFO] Epoch:[1/2](1400/63764) loss:3.185 lr:0.0000010 epoch_Time:383.0min: [2023-12-11 14:27:42,018][model3_sft.py][INFO] Epoch:[1/2](1450/63764) loss:2.812 lr:0.0000010 epoch_Time:383.0min: [2023-12-11 14:28:00,397][model3_sft.py][INFO] Epoch:[1/2](1500/63764) loss:3.642 lr:0.0000010 epoch_Time:382.0min: [2023-12-11 14:28:18,826][model3_sft.py][INFO] Epoch:[1/2](1550/63764) loss:3.063 lr:0.0000010 epoch_Time:382.0min: [2023-12-11 14:28:37,284][model3_sft.py][INFO] Epoch:[1/2](1600/63764) loss:3.235 lr:0.0000010 epoch_Time:382.0min: [2023-12-11 14:28:55,858][model3_sft.py][INFO] Epoch:[1/2](1650/63764) loss:3.123 lr:0.0000010 epoch_Time:381.0min: [2023-12-11 14:29:14,262][model3_sft.py][INFO] Epoch:[1/2](1700/63764) loss:3.191 lr:0.0000010 epoch_Time:381.0min: [2023-12-11 
14:29:32,624][model3_sft.py][INFO] Epoch:[1/2](1750/63764) loss:2.966 lr:0.0000010 epoch_Time:381.0min: [2023-12-11 14:29:51,017][model3_sft.py][INFO] Epoch:[1/2](1800/63764) loss:3.451 lr:0.0000010 epoch_Time:380.0min: [2023-12-11 14:30:09,443][model3_sft.py][INFO] Epoch:[1/2](1850/63764) loss:2.764 lr:0.0000010 epoch_Time:380.0min: [2023-12-11 14:30:27,864][model3_sft.py][INFO] Epoch:[1/2](1900/63764) loss:3.420 lr:0.0000010 epoch_Time:380.0min: [2023-12-11 14:30:46,241][model3_sft.py][INFO] Epoch:[1/2](1950/63764) loss:2.705 lr:0.0000010 epoch_Time:380.0min: [2023-12-11 14:31:04,653][model3_sft.py][INFO] Epoch:[1/2](2000/63764) loss:2.430 lr:0.0000010 epoch_Time:379.0min: [2023-12-11 14:31:23,025][model3_sft.py][INFO] Epoch:[1/2](2050/63764) loss:2.911 lr:0.0000010 epoch_Time:379.0min: [2023-12-11 14:31:41,412][model3_sft.py][INFO] Epoch:[1/2](2100/63764) loss:2.639 lr:0.0000010 epoch_Time:379.0min: [2023-12-11 14:31:59,790][model3_sft.py][INFO] Epoch:[1/2](2150/63764) loss:2.735 lr:0.0000010 epoch_Time:378.0min: [2023-12-11 14:32:18,199][model3_sft.py][INFO] Epoch:[1/2](2200/63764) loss:3.193 lr:0.0000010 epoch_Time:378.0min: [2023-12-11 14:32:36,604][model3_sft.py][INFO] Epoch:[1/2](2250/63764) loss:3.008 lr:0.0000010 epoch_Time:378.0min: [2023-12-11 14:32:55,008][model3_sft.py][INFO] Epoch:[1/2](2300/63764) loss:2.577 lr:0.0000010 epoch_Time:377.0min: [2023-12-11 14:33:13,388][model3_sft.py][INFO] Epoch:[1/2](2350/63764) loss:2.695 lr:0.0000010 epoch_Time:377.0min: [2023-12-11 14:33:31,747][model3_sft.py][INFO] Epoch:[1/2](2400/63764) loss:2.132 lr:0.0000010 epoch_Time:377.0min: [2023-12-11 14:33:50,143][model3_sft.py][INFO] Epoch:[1/2](2450/63764) loss:2.800 lr:0.0000010 epoch_Time:376.0min: [2023-12-11 14:34:08,567][model3_sft.py][INFO] Epoch:[1/2](2500/63764) loss:3.424 lr:0.0000010 epoch_Time:376.0min: [2023-12-11 14:34:26,996][model3_sft.py][INFO] Epoch:[1/2](2550/63764) loss:3.420 lr:0.0000010 epoch_Time:376.0min: [2023-12-11 
14:34:45,664][model3_sft.py][INFO] Epoch:[1/2](2600/63764) loss:3.192 lr:0.0000010 epoch_Time:376.0min: [2023-12-11 14:35:04,039][model3_sft.py][INFO] Epoch:[1/2](2650/63764) loss:2.293 lr:0.0000010 epoch_Time:375.0min: [2023-12-11 14:35:22,421][model3_sft.py][INFO] Epoch:[1/2](2700/63764) loss:3.232 lr:0.0000010 epoch_Time:375.0min: [2023-12-11 14:35:40,775][model3_sft.py][INFO] Epoch:[1/2](2750/63764) loss:2.950 lr:0.0000010 epoch_Time:375.0min: [2023-12-11 14:35:59,162][model3_sft.py][INFO] Epoch:[1/2](2800/63764) loss:3.381 lr:0.0000010 epoch_Time:374.0min: [2023-12-11 14:36:17,599][model3_sft.py][INFO] Epoch:[1/2](2850/63764) loss:3.323 lr:0.0000010 epoch_Time:374.0min: [2023-12-11 14:36:35,965][model3_sft.py][INFO] Epoch:[1/2](2900/63764) loss:2.557 lr:0.0000010 epoch_Time:374.0min: [2023-12-11 14:36:54,360][model3_sft.py][INFO] Epoch:[1/2](2950/63764) loss:3.370 lr:0.0000010 epoch_Time:373.0min: [2023-12-11 14:37:12,769][model3_sft.py][INFO] Epoch:[1/2](3000/63764) loss:3.607 lr:0.0000010 epoch_Time:373.0min: [2023-12-11 14:37:31,181][model3_sft.py][INFO] Epoch:[1/2](3050/63764) loss:3.463 lr:0.0000010 epoch_Time:373.0min: [2023-12-11 14:37:49,586][model3_sft.py][INFO] Epoch:[1/2](3100/63764) loss:2.966 lr:0.0000010 epoch_Time:372.0min: [2023-12-11 14:38:08,002][model3_sft.py][INFO] Epoch:[1/2](3150/63764) loss:3.221 lr:0.0000010 epoch_Time:372.0min: [2023-12-11 14:38:26,380][model3_sft.py][INFO] Epoch:[1/2](3200/63764) loss:3.225 lr:0.0000010 epoch_Time:372.0min: [2023-12-11 14:38:44,776][model3_sft.py][INFO] Epoch:[1/2](3250/63764) loss:3.171 lr:0.0000010 epoch_Time:372.0min: [2023-12-11 14:39:03,167][model3_sft.py][INFO] Epoch:[1/2](3300/63764) loss:2.607 lr:0.0000010 epoch_Time:371.0min: [2023-12-11 14:39:21,573][model3_sft.py][INFO] Epoch:[1/2](3350/63764) loss:3.606 lr:0.0000010 epoch_Time:371.0min: [2023-12-11 14:39:40,007][model3_sft.py][INFO] Epoch:[1/2](3400/63764) loss:2.467 lr:0.0000010 epoch_Time:371.0min: [2023-12-11 
14:39:58,399][model3_sft.py][INFO] Epoch:[1/2](3450/63764) loss:3.387 lr:0.0000010 epoch_Time:370.0min: [2023-12-11 14:40:16,764][model3_sft.py][INFO] Epoch:[1/2](3500/63764) loss:2.719 lr:0.0000010 epoch_Time:370.0min: [2023-12-11 14:40:35,179][model3_sft.py][INFO] Epoch:[1/2](3550/63764) loss:2.651 lr:0.0000010 epoch_Time:370.0min: [2023-12-11 14:40:53,813][model3_sft.py][INFO] Epoch:[1/2](3600/63764) loss:3.208 lr:0.0000010 epoch_Time:369.0min: [2023-12-11 14:41:12,217][model3_sft.py][INFO] Epoch:[1/2](3650/63764) loss:2.707 lr:0.0000010 epoch_Time:369.0min: [2023-12-11 14:41:30,637][model3_sft.py][INFO] Epoch:[1/2](3700/63764) loss:3.867 lr:0.0000010 epoch_Time:369.0min: [2023-12-11 14:41:49,041][model3_sft.py][INFO] Epoch:[1/2](3750/63764) loss:3.272 lr:0.0000010 epoch_Time:368.0min: [2023-12-11 14:42:07,448][model3_sft.py][INFO] Epoch:[1/2](3800/63764) loss:3.098 lr:0.0000010 epoch_Time:368.0min: [2023-12-11 14:42:25,828][model3_sft.py][INFO] Epoch:[1/2](3850/63764) loss:2.813 lr:0.0000010 epoch_Time:368.0min: [2023-12-11 14:42:44,243][model3_sft.py][INFO] Epoch:[1/2](3900/63764) loss:3.328 lr:0.0000010 epoch_Time:368.0min: [2023-12-11 14:43:02,630][model3_sft.py][INFO] Epoch:[1/2](3950/63764) loss:2.775 lr:0.0000010 epoch_Time:367.0min: [2023-12-11 14:43:20,994][model3_sft.py][INFO] Epoch:[1/2](4000/63764) loss:2.394 lr:0.0000010 epoch_Time:367.0min: [2023-12-11 14:43:39,404][model3_sft.py][INFO] Epoch:[1/2](4050/63764) loss:2.791 lr:0.0000010 epoch_Time:367.0min: [2023-12-11 14:43:57,804][model3_sft.py][INFO] Epoch:[1/2](4100/63764) loss:3.460 lr:0.0000010 epoch_Time:366.0min: [2023-12-11 14:44:16,246][model3_sft.py][INFO] Epoch:[1/2](4150/63764) loss:3.041 lr:0.0000010 epoch_Time:366.0min: [2023-12-11 14:44:34,672][model3_sft.py][INFO] Epoch:[1/2](4200/63764) loss:3.056 lr:0.0000010 epoch_Time:366.0min: [2023-12-11 14:44:53,101][model3_sft.py][INFO] Epoch:[1/2](4250/63764) loss:3.172 lr:0.0000010 epoch_Time:365.0min: [2023-12-11 
14:45:11,507][model3_sft.py][INFO] Epoch:[1/2](4300/63764) loss:3.164 lr:0.0000010 epoch_Time:365.0min: [2023-12-11 14:45:29,946][model3_sft.py][INFO] Epoch:[1/2](4350/63764) loss:3.121 lr:0.0000010 epoch_Time:365.0min: [2023-12-11 14:45:48,348][model3_sft.py][INFO] Epoch:[1/2](4400/63764) loss:3.555 lr:0.0000010 epoch_Time:364.0min: [2023-12-11 14:46:06,741][model3_sft.py][INFO] Epoch:[1/2](4450/63764) loss:2.737 lr:0.0000010 epoch_Time:364.0min: [2023-12-11 14:46:25,163][model3_sft.py][INFO] Epoch:[1/2](4500/63764) loss:3.476 lr:0.0000010 epoch_Time:364.0min: [2023-12-11 14:46:43,782][model3_sft.py][INFO] Epoch:[1/2](4550/63764) loss:3.535 lr:0.0000010 epoch_Time:364.0min: [2023-12-11 14:47:02,135][model3_sft.py][INFO] Epoch:[1/2](4600/63764) loss:2.949 lr:0.0000010 epoch_Time:363.0min: [2023-12-11 14:47:20,541][model3_sft.py][INFO] Epoch:[1/2](4650/63764) loss:3.154 lr:0.0000010 epoch_Time:363.0min: [2023-12-11 14:47:38,974][model3_sft.py][INFO] Epoch:[1/2](4700/63764) loss:3.528 lr:0.0000010 epoch_Time:363.0min: [2023-12-11 14:47:57,448][model3_sft.py][INFO] Epoch:[1/2](4750/63764) loss:2.737 lr:0.0000010 epoch_Time:362.0min: [2023-12-11 14:48:15,844][model3_sft.py][INFO] Epoch:[1/2](4800/63764) loss:3.345 lr:0.0000010 epoch_Time:362.0min: [2023-12-11 14:48:34,264][model3_sft.py][INFO] Epoch:[1/2](4850/63764) loss:3.396 lr:0.0000010 epoch_Time:362.0min: [2023-12-11 14:48:52,711][model3_sft.py][INFO] Epoch:[1/2](4900/63764) loss:2.736 lr:0.0000010 epoch_Time:361.0min: [2023-12-11 14:49:11,076][model3_sft.py][INFO] Epoch:[1/2](4950/63764) loss:3.357 lr:0.0000010 epoch_Time:361.0min: [2023-12-11 14:49:29,480][model3_sft.py][INFO] Epoch:[1/2](5000/63764) loss:3.634 lr:0.0000010 epoch_Time:361.0min: [2023-12-11 14:49:47,905][model3_sft.py][INFO] Epoch:[1/2](5050/63764) loss:2.950 lr:0.0000010 epoch_Time:360.0min: [2023-12-11 14:50:06,333][model3_sft.py][INFO] Epoch:[1/2](5100/63764) loss:2.940 lr:0.0000010 epoch_Time:360.0min: [2023-12-11 
14:50:24,735][model3_sft.py][INFO] Epoch:[1/2](5150/63764) loss:4.395 lr:0.0000010 epoch_Time:360.0min: [2023-12-11 14:50:43,111][model3_sft.py][INFO] Epoch:[1/2](5200/63764) loss:2.779 lr:0.0000010 epoch_Time:360.0min: [2023-12-11 14:51:01,469][model3_sft.py][INFO] Epoch:[1/2](5250/63764) loss:3.536 lr:0.0000010 epoch_Time:359.0min: [2023-12-11 14:51:19,911][model3_sft.py][INFO] Epoch:[1/2](5300/63764) loss:3.553 lr:0.0000010 epoch_Time:359.0min: [2023-12-11 14:51:38,305][model3_sft.py][INFO] Epoch:[1/2](5350/63764) loss:3.120 lr:0.0000010 epoch_Time:359.0min: [2023-12-11 14:51:56,704][model3_sft.py][INFO] Epoch:[1/2](5400/63764) loss:3.071 lr:0.0000010 epoch_Time:358.0min: [2023-12-11 14:52:15,111][model3_sft.py][INFO] Epoch:[1/2](5450/63764) loss:2.909 lr:0.0000010 epoch_Time:358.0min: [2023-12-11 14:52:33,739][model3_sft.py][INFO] Epoch:[1/2](5500/63764) loss:3.111 lr:0.0000010 epoch_Time:358.0min: [2023-12-11 14:52:52,129][model3_sft.py][INFO] Epoch:[1/2](5550/63764) loss:2.699 lr:0.0000010 epoch_Time:357.0min: [2023-12-11 14:53:10,525][model3_sft.py][INFO] Epoch:[1/2](5600/63764) loss:2.901 lr:0.0000010 epoch_Time:357.0min: [2023-12-11 14:53:28,915][model3_sft.py][INFO] Epoch:[1/2](5650/63764) loss:3.286 lr:0.0000010 epoch_Time:357.0min: [2023-12-11 14:53:47,336][model3_sft.py][INFO] Epoch:[1/2](5700/63764) loss:3.241 lr:0.0000010 epoch_Time:357.0min: [2023-12-11 14:54:05,776][model3_sft.py][INFO] Epoch:[1/2](5750/63764) loss:3.829 lr:0.0000010 epoch_Time:356.0min: [2023-12-11 14:54:24,136][model3_sft.py][INFO] Epoch:[1/2](5800/63764) loss:3.195 lr:0.0000010 epoch_Time:356.0min: [2023-12-11 14:54:42,538][model3_sft.py][INFO] Epoch:[1/2](5850/63764) loss:2.812 lr:0.0000010 epoch_Time:356.0min: [2023-12-11 14:55:01,019][model3_sft.py][INFO] Epoch:[1/2](5900/63764) loss:3.770 lr:0.0000010 epoch_Time:355.0min: [2023-12-11 14:55:19,430][model3_sft.py][INFO] Epoch:[1/2](5950/63764) loss:3.040 lr:0.0000010 epoch_Time:355.0min: [2023-12-11 
14:55:37,846][model3_sft.py][INFO] Epoch:[1/2](6000/63764) loss:3.559 lr:0.0000010 epoch_Time:355.0min: [2023-12-11 14:55:56,245][model3_sft.py][INFO] Epoch:[1/2](6050/63764) loss:2.782 lr:0.0000010 epoch_Time:354.0min: [2023-12-11 14:56:14,634][model3_sft.py][INFO] Epoch:[1/2](6100/63764) loss:2.661 lr:0.0000010 epoch_Time:354.0min: [2023-12-11 14:56:33,030][model3_sft.py][INFO] Epoch:[1/2](6150/63764) loss:3.307 lr:0.0000010 epoch_Time:354.0min: [2023-12-11 14:56:51,427][model3_sft.py][INFO] Epoch:[1/2](6200/63764) loss:2.698 lr:0.0000010 epoch_Time:353.0min: [2023-12-11 14:57:09,862][model3_sft.py][INFO] Epoch:[1/2](6250/63764) loss:3.761 lr:0.0000010 epoch_Time:353.0min: [2023-12-11 14:57:28,247][model3_sft.py][INFO] Epoch:[1/2](6300/63764) loss:3.273 lr:0.0000010 epoch_Time:353.0min: [2023-12-11 14:57:46,641][model3_sft.py][INFO] Epoch:[1/2](6350/63764) loss:2.584 lr:0.0000010 epoch_Time:353.0min: [2023-12-11 14:58:05,081][model3_sft.py][INFO] Epoch:[1/2](6400/63764) loss:3.953 lr:0.0000010 epoch_Time:352.0min: [2023-12-11 14:58:23,692][model3_sft.py][INFO] Epoch:[1/2](6450/63764) loss:3.313 lr:0.0000010 epoch_Time:352.0min: [2023-12-11 14:58:42,106][model3_sft.py][INFO] Epoch:[1/2](6500/63764) loss:3.111 lr:0.0000010 epoch_Time:352.0min: [2023-12-11 14:59:00,495][model3_sft.py][INFO] Epoch:[1/2](6550/63764) loss:2.321 lr:0.0000010 epoch_Time:351.0min: [2023-12-11 14:59:18,913][model3_sft.py][INFO] Epoch:[1/2](6600/63764) loss:3.209 lr:0.0000010 epoch_Time:351.0min: [2023-12-11 14:59:37,328][model3_sft.py][INFO] Epoch:[1/2](6650/63764) loss:3.538 lr:0.0000010 epoch_Time:351.0min: [2023-12-11 14:59:55,706][model3_sft.py][INFO] Epoch:[1/2](6700/63764) loss:2.928 lr:0.0000010 epoch_Time:350.0min: [2023-12-11 15:00:14,116][model3_sft.py][INFO] Epoch:[1/2](6750/63764) loss:2.975 lr:0.0000010 epoch_Time:350.0min: [2023-12-11 15:00:32,530][model3_sft.py][INFO] Epoch:[1/2](6800/63764) loss:3.140 lr:0.0000010 epoch_Time:350.0min: [2023-12-11 
15:00:50,925][model3_sft.py][INFO] Epoch:[1/2](6850/63764) loss:3.058 lr:0.0000010 epoch_Time:349.0min: [2023-12-11 15:01:09,341][model3_sft.py][INFO] Epoch:[1/2](6900/63764) loss:3.541 lr:0.0000010 epoch_Time:349.0min: [2023-12-11 15:01:27,744][model3_sft.py][INFO] Epoch:[1/2](6950/63764) loss:2.945 lr:0.0000010 epoch_Time:349.0min: [2023-12-11 15:01:46,160][model3_sft.py][INFO] Epoch:[1/2](7000/63764) loss:2.883 lr:0.0000010 epoch_Time:349.0min: [2023-12-11 15:02:04,542][model3_sft.py][INFO] Epoch:[1/2](7050/63764) loss:2.822 lr:0.0000010 epoch_Time:348.0min: [2023-12-11 15:02:22,941][model3_sft.py][INFO] Epoch:[1/2](7100/63764) loss:3.040 lr:0.0000010 epoch_Time:348.0min: [2023-12-11 15:02:41,338][model3_sft.py][INFO] Epoch:[1/2](7150/63764) loss:2.751 lr:0.0000010 epoch_Time:348.0min: [2023-12-11 15:02:59,768][model3_sft.py][INFO] Epoch:[1/2](7200/63764) loss:3.259 lr:0.0000010 epoch_Time:347.0min: [2023-12-11 15:03:18,160][model3_sft.py][INFO] Epoch:[1/2](7250/63764) loss:3.346 lr:0.0000010 epoch_Time:347.0min: [2023-12-11 15:03:36,557][model3_sft.py][INFO] Epoch:[1/2](7300/63764) loss:3.000 lr:0.0000010 epoch_Time:347.0min: [2023-12-11 15:03:54,977][model3_sft.py][INFO] Epoch:[1/2](7350/63764) loss:2.843 lr:0.0000010 epoch_Time:346.0min: [2023-12-11 15:04:13,614][model3_sft.py][INFO] Epoch:[1/2](7400/63764) loss:2.851 lr:0.0000010 epoch_Time:346.0min: [2023-12-11 15:04:32,000][model3_sft.py][INFO] Epoch:[1/2](7450/63764) loss:2.710 lr:0.0000010 epoch_Time:346.0min: [2023-12-11 15:04:50,382][model3_sft.py][INFO] Epoch:[1/2](7500/63764) loss:2.591 lr:0.0000010 epoch_Time:345.0min: [2023-12-11 15:05:08,786][model3_sft.py][INFO] Epoch:[1/2](7550/63764) loss:3.003 lr:0.0000010 epoch_Time:345.0min: [2023-12-11 15:05:27,199][model3_sft.py][INFO] Epoch:[1/2](7600/63764) loss:3.079 lr:0.0000010 epoch_Time:345.0min: [2023-12-11 15:05:45,568][model3_sft.py][INFO] Epoch:[1/2](7650/63764) loss:2.999 lr:0.0000010 epoch_Time:345.0min: [2023-12-11 
15:06:03,963][model3_sft.py][INFO] Epoch:[1/2](7700/63764) loss:3.362 lr:0.0000010 epoch_Time:344.0min: [2023-12-11 15:06:22,380][model3_sft.py][INFO] Epoch:[1/2](7750/63764) loss:3.864 lr:0.0000010 epoch_Time:344.0min: [2023-12-11 15:06:40,805][model3_sft.py][INFO] Epoch:[1/2](7800/63764) loss:3.169 lr:0.0000010 epoch_Time:344.0min: [2023-12-11 15:06:59,204][model3_sft.py][INFO] Epoch:[1/2](7850/63764) loss:2.504 lr:0.0000010 epoch_Time:343.0min: [2023-12-11 15:07:17,581][model3_sft.py][INFO] Epoch:[1/2](7900/63764) loss:3.548 lr:0.0000010 epoch_Time:343.0min: [2023-12-11 15:07:36,008][model3_sft.py][INFO] Epoch:[1/2](7950/63764) loss:3.117 lr:0.0000010 epoch_Time:343.0min: [2023-12-11 15:07:54,388][model3_sft.py][INFO] Epoch:[1/2](8000/63764) loss:2.908 lr:0.0000010 epoch_Time:342.0min: [2023-12-11 15:08:12,771][model3_sft.py][INFO] Epoch:[1/2](8050/63764) loss:3.345 lr:0.0000010 epoch_Time:342.0min: [2023-12-11 15:08:31,183][model3_sft.py][INFO] Epoch:[1/2](8100/63764) loss:3.120 lr:0.0000010 epoch_Time:342.0min: [2023-12-11 15:08:49,593][model3_sft.py][INFO] Epoch:[1/2](8150/63764) loss:3.401 lr:0.0000010 epoch_Time:341.0min: [2023-12-11 15:09:07,980][model3_sft.py][INFO] Epoch:[1/2](8200/63764) loss:3.262 lr:0.0000010 epoch_Time:341.0min: [2023-12-11 15:09:26,370][model3_sft.py][INFO] Epoch:[1/2](8250/63764) loss:3.335 lr:0.0000010 epoch_Time:341.0min: [2023-12-11 15:09:44,833][model3_sft.py][INFO] Epoch:[1/2](8300/63764) loss:3.392 lr:0.0000010 epoch_Time:341.0min: [2023-12-11 15:10:03,268][model3_sft.py][INFO] Epoch:[1/2](8350/63764) loss:3.765 lr:0.0000010 epoch_Time:340.0min: [2023-12-11 15:10:21,836][model3_sft.py][INFO] Epoch:[1/2](8400/63764) loss:2.833 lr:0.0000010 epoch_Time:340.0min: [2023-12-11 15:10:40,207][model3_sft.py][INFO] Epoch:[1/2](8450/63764) loss:2.764 lr:0.0000010 epoch_Time:340.0min: [2023-12-11 15:10:58,613][model3_sft.py][INFO] Epoch:[1/2](8500/63764) loss:3.361 lr:0.0000010 epoch_Time:339.0min: [2023-12-11 
15:11:17,070][model3_sft.py][INFO] Epoch:[1/2](8550/63764) loss:3.072 lr:0.0000010 epoch_Time:339.0min: [2023-12-11 15:11:35,450][model3_sft.py][INFO] Epoch:[1/2](8600/63764) loss:3.702 lr:0.0000010 epoch_Time:339.0min: [2023-12-11 15:11:53,864][model3_sft.py][INFO] Epoch:[1/2](8650/63764) loss:3.137 lr:0.0000010 epoch_Time:338.0min: [2023-12-11 15:12:12,290][model3_sft.py][INFO] Epoch:[1/2](8700/63764) loss:2.969 lr:0.0000010 epoch_Time:338.0min: [2023-12-11 15:12:30,704][model3_sft.py][INFO] Epoch:[1/2](8750/63764) loss:3.424 lr:0.0000010 epoch_Time:338.0min: [2023-12-11 15:12:49,129][model3_sft.py][INFO] Epoch:[1/2](8800/63764) loss:3.307 lr:0.0000010 epoch_Time:337.0min: [2023-12-11 15:13:07,513][model3_sft.py][INFO] Epoch:[1/2](8850/63764) loss:3.138 lr:0.0000010 epoch_Time:337.0min: [2023-12-11 15:13:25,933][model3_sft.py][INFO] Epoch:[1/2](8900/63764) loss:2.688 lr:0.0000010 epoch_Time:337.0min: [2023-12-11 15:13:44,322][model3_sft.py][INFO] Epoch:[1/2](8950/63764) loss:3.457 lr:0.0000010 epoch_Time:337.0min: [2023-12-11 15:14:02,719][model3_sft.py][INFO] Epoch:[1/2](9000/63764) loss:2.676 lr:0.0000010 epoch_Time:336.0min: [2023-12-11 15:14:21,126][model3_sft.py][INFO] Epoch:[1/2](9050/63764) loss:3.976 lr:0.0000010 epoch_Time:336.0min: [2023-12-11 15:14:39,514][model3_sft.py][INFO] Epoch:[1/2](9100/63764) loss:3.834 lr:0.0000010 epoch_Time:336.0min: [2023-12-11 15:14:57,917][model3_sft.py][INFO] Epoch:[1/2](9150/63764) loss:3.184 lr:0.0000010 epoch_Time:335.0min: [2023-12-11 15:15:16,382][model3_sft.py][INFO] Epoch:[1/2](9200/63764) loss:2.679 lr:0.0000010 epoch_Time:335.0min: [2023-12-11 15:15:34,754][model3_sft.py][INFO] Epoch:[1/2](9250/63764) loss:2.784 lr:0.0000010 epoch_Time:335.0min: [2023-12-11 15:15:53,172][model3_sft.py][INFO] Epoch:[1/2](9300/63764) loss:2.509 lr:0.0000010 epoch_Time:334.0min: [2023-12-11 15:16:11,833][model3_sft.py][INFO] Epoch:[1/2](9350/63764) loss:3.002 lr:0.0000010 epoch_Time:334.0min: [2023-12-11 
15:16:30,244][model3_sft.py][INFO] Epoch:[1/2](9400/63764) loss:2.739 lr:0.0000010 epoch_Time:334.0min: [2023-12-11 15:16:48,680][model3_sft.py][INFO] Epoch:[1/2](9450/63764) loss:3.045 lr:0.0000010 epoch_Time:333.0min: [2023-12-11 15:17:07,097][model3_sft.py][INFO] Epoch:[1/2](9500/63764) loss:2.607 lr:0.0000010 epoch_Time:333.0min: [2023-12-11 15:17:25,471][model3_sft.py][INFO] Epoch:[1/2](9550/63764) loss:3.266 lr:0.0000010 epoch_Time:333.0min: [2023-12-11 15:17:43,899][model3_sft.py][INFO] Epoch:[1/2](9600/63764) loss:3.202 lr:0.0000010 epoch_Time:333.0min: [2023-12-11 15:18:02,298][model3_sft.py][INFO] Epoch:[1/2](9650/63764) loss:3.613 lr:0.0000010 epoch_Time:332.0min: [2023-12-11 15:18:20,704][model3_sft.py][INFO] Epoch:[1/2](9700/63764) loss:3.821 lr:0.0000010 epoch_Time:332.0min: [2023-12-11 15:18:39,084][model3_sft.py][INFO] Epoch:[1/2](9750/63764) loss:3.009 lr:0.0000010 epoch_Time:332.0min: [2023-12-11 15:18:57,463][model3_sft.py][INFO] Epoch:[1/2](9800/63764) loss:3.831 lr:0.0000010 epoch_Time:331.0min: [2023-12-11 15:19:15,895][model3_sft.py][INFO] Epoch:[1/2](9850/63764) loss:3.407 lr:0.0000010 epoch_Time:331.0min: [2023-12-11 15:19:34,352][model3_sft.py][INFO] Epoch:[1/2](9900/63764) loss:3.118 lr:0.0000010 epoch_Time:331.0min: [2023-12-11 15:19:52,762][model3_sft.py][INFO] Epoch:[1/2](9950/63764) loss:2.887 lr:0.0000010 epoch_Time:330.0min: [2023-12-11 15:20:11,158][model3_sft.py][INFO] Epoch:[1/2](10000/63764) loss:2.477 lr:0.0000010 epoch_Time:330.0min: [2023-12-11 15:20:29,557][model3_sft.py][INFO] Epoch:[1/2](10050/63764) loss:2.753 lr:0.0000010 epoch_Time:330.0min: [2023-12-11 15:20:47,951][model3_sft.py][INFO] Epoch:[1/2](10100/63764) loss:3.816 lr:0.0000010 epoch_Time:329.0min: [2023-12-11 15:21:06,340][model3_sft.py][INFO] Epoch:[1/2](10150/63764) loss:2.505 lr:0.0000010 epoch_Time:329.0min: [2023-12-11 15:21:24,775][model3_sft.py][INFO] Epoch:[1/2](10200/63764) loss:2.952 lr:0.0000010 epoch_Time:329.0min: [2023-12-11 
15:21:43,261][model3_sft.py][INFO] Epoch:[1/2](10250/63764) loss:3.072 lr:0.0000010 epoch_Time:329.0min: [2023-12-11 15:22:01,904][model3_sft.py][INFO] Epoch:[1/2](10300/63764) loss:2.451 lr:0.0000010 epoch_Time:328.0min: [2023-12-11 15:22:20,318][model3_sft.py][INFO] Epoch:[1/2](10350/63764) loss:2.784 lr:0.0000010 epoch_Time:328.0min: [2023-12-11 15:22:38,753][model3_sft.py][INFO] Epoch:[1/2](10400/63764) loss:2.461 lr:0.0000010 epoch_Time:328.0min: [2023-12-11 15:22:57,212][model3_sft.py][INFO] Epoch:[1/2](10450/63764) loss:2.503 lr:0.0000010 epoch_Time:327.0min: [2023-12-11 15:23:15,616][model3_sft.py][INFO] Epoch:[1/2](10500/63764) loss:2.713 lr:0.0000010 epoch_Time:327.0min: [2023-12-11 15:23:33,988][model3_sft.py][INFO] Epoch:[1/2](10550/63764) loss:3.404 lr:0.0000010 epoch_Time:327.0min: [2023-12-11 15:23:52,357][model3_sft.py][INFO] Epoch:[1/2](10600/63764) loss:3.676 lr:0.0000010 epoch_Time:326.0min: [2023-12-11 15:24:10,726][model3_sft.py][INFO] Epoch:[1/2](10650/63764) loss:3.192 lr:0.0000010 epoch_Time:326.0min: [2023-12-11 15:24:29,169][model3_sft.py][INFO] Epoch:[1/2](10700/63764) loss:3.091 lr:0.0000010 epoch_Time:326.0min: [2023-12-11 15:24:47,554][model3_sft.py][INFO] Epoch:[1/2](10750/63764) loss:3.152 lr:0.0000010 epoch_Time:326.0min: [2023-12-11 15:25:05,954][model3_sft.py][INFO] Epoch:[1/2](10800/63764) loss:3.356 lr:0.0000010 epoch_Time:325.0min: [2023-12-11 15:25:24,310][model3_sft.py][INFO] Epoch:[1/2](10850/63764) loss:2.998 lr:0.0000010 epoch_Time:325.0min: [2023-12-11 15:25:42,660][model3_sft.py][INFO] Epoch:[1/2](10900/63764) loss:3.689 lr:0.0000010 epoch_Time:325.0min: [2023-12-11 15:26:01,036][model3_sft.py][INFO] Epoch:[1/2](10950/63764) loss:3.229 lr:0.0000010 epoch_Time:324.0min: [2023-12-11 15:26:19,443][model3_sft.py][INFO] Epoch:[1/2](11000/63764) loss:3.129 lr:0.0000010 epoch_Time:324.0min: [2023-12-11 15:26:37,832][model3_sft.py][INFO] Epoch:[1/2](11050/63764) loss:2.982 lr:0.0000010 epoch_Time:324.0min: [2023-12-11 
15:26:56,232][model3_sft.py][INFO] Epoch:[1/2](11100/63764) loss:2.482 lr:0.0000010 epoch_Time:323.0min: [2023-12-11 15:27:14,641][model3_sft.py][INFO] Epoch:[1/2](11150/63764) loss:3.579 lr:0.0000010 epoch_Time:323.0min: [2023-12-11 15:27:33,012][model3_sft.py][INFO] Epoch:[1/2](11200/63764) loss:2.894 lr:0.0000010 epoch_Time:323.0min: [2023-12-11 15:27:51,622][model3_sft.py][INFO] Epoch:[1/2](11250/63764) loss:2.969 lr:0.0000010 epoch_Time:322.0min: [2023-12-11 15:28:10,037][model3_sft.py][INFO] Epoch:[1/2](11300/63764) loss:3.307 lr:0.0000010 epoch_Time:322.0min: [2023-12-11 15:28:28,449][model3_sft.py][INFO] Epoch:[1/2](11350/63764) loss:3.047 lr:0.0000010 epoch_Time:322.0min: [2023-12-11 15:28:46,841][model3_sft.py][INFO] Epoch:[1/2](11400/63764) loss:2.846 lr:0.0000010 epoch_Time:322.0min: [2023-12-11 15:29:05,220][model3_sft.py][INFO] Epoch:[1/2](11450/63764) loss:3.121 lr:0.0000010 epoch_Time:321.0min: [2023-12-11 15:29:23,618][model3_sft.py][INFO] Epoch:[1/2](11500/63764) loss:2.832 lr:0.0000010 epoch_Time:321.0min: [2023-12-11 15:29:42,008][model3_sft.py][INFO] Epoch:[1/2](11550/63764) loss:3.884 lr:0.0000010 epoch_Time:321.0min: [2023-12-11 15:30:00,423][model3_sft.py][INFO] Epoch:[1/2](11600/63764) loss:3.051 lr:0.0000010 epoch_Time:320.0min: [2023-12-11 15:30:18,823][model3_sft.py][INFO] Epoch:[1/2](11650/63764) loss:2.046 lr:0.0000010 epoch_Time:320.0min: [2023-12-11 15:30:37,264][model3_sft.py][INFO] Epoch:[1/2](11700/63764) loss:3.123 lr:0.0000010 epoch_Time:320.0min: [2023-12-11 15:30:55,637][model3_sft.py][INFO] Epoch:[1/2](11750/63764) loss:2.958 lr:0.0000010 epoch_Time:319.0min: [2023-12-11 15:31:14,127][model3_sft.py][INFO] Epoch:[1/2](11800/63764) loss:3.162 lr:0.0000010 epoch_Time:319.0min: [2023-12-11 15:31:32,499][model3_sft.py][INFO] Epoch:[1/2](11850/63764) loss:3.248 lr:0.0000010 epoch_Time:319.0min: [2023-12-11 15:31:50,872][model3_sft.py][INFO] Epoch:[1/2](11900/63764) loss:3.001 lr:0.0000010 epoch_Time:318.0min: [2023-12-11 
15:32:09,282][model3_sft.py][INFO] Epoch:[1/2](11950/63764) loss:2.380 lr:0.0000010 epoch_Time:318.0min: [2023-12-11 15:32:27,684][model3_sft.py][INFO] Epoch:[1/2](12000/63764) loss:2.546 lr:0.0000010 epoch_Time:318.0min: [2023-12-11 15:32:46,123][model3_sft.py][INFO] Epoch:[1/2](12050/63764) loss:2.307 lr:0.0000010 epoch_Time:318.0min: [2023-12-11 15:33:04,508][model3_sft.py][INFO] Epoch:[1/2](12100/63764) loss:3.036 lr:0.0000010 epoch_Time:317.0min: [2023-12-11 15:33:22,901][model3_sft.py][INFO] Epoch:[1/2](12150/63764) loss:3.587 lr:0.0000010 epoch_Time:317.0min: [2023-12-11 15:33:41,534][model3_sft.py][INFO] Epoch:[1/2](12200/63764) loss:2.741 lr:0.0000010 epoch_Time:317.0min: [2023-12-11 15:33:59,937][model3_sft.py][INFO] Epoch:[1/2](12250/63764) loss:2.821 lr:0.0000010 epoch_Time:316.0min: [2023-12-11 15:34:18,385][model3_sft.py][INFO] Epoch:[1/2](12300/63764) loss:3.003 lr:0.0000010 epoch_Time:316.0min: [2023-12-11 15:34:36,743][model3_sft.py][INFO] Epoch:[1/2](12350/63764) loss:2.488 lr:0.0000010 epoch_Time:316.0min: [2023-12-11 15:34:55,159][model3_sft.py][INFO] Epoch:[1/2](12400/63764) loss:2.258 lr:0.0000010 epoch_Time:315.0min: [2023-12-11 15:35:13,562][model3_sft.py][INFO] Epoch:[1/2](12450/63764) loss:2.910 lr:0.0000010 epoch_Time:315.0min: [2023-12-11 15:35:32,003][model3_sft.py][INFO] Epoch:[1/2](12500/63764) loss:2.601 lr:0.0000010 epoch_Time:315.0min: [2023-12-11 15:35:50,387][model3_sft.py][INFO] Epoch:[1/2](12550/63764) loss:2.056 lr:0.0000010 epoch_Time:314.0min: [2023-12-11 15:36:08,777][model3_sft.py][INFO] Epoch:[1/2](12600/63764) loss:2.781 lr:0.0000010 epoch_Time:314.0min: [2023-12-11 15:36:27,129][model3_sft.py][INFO] Epoch:[1/2](12650/63764) loss:2.814 lr:0.0000010 epoch_Time:314.0min: [2023-12-11 15:36:45,543][model3_sft.py][INFO] Epoch:[1/2](12700/63764) loss:3.053 lr:0.0000010 epoch_Time:314.0min: [2023-12-11 15:37:03,955][model3_sft.py][INFO] Epoch:[1/2](12750/63764) loss:3.809 lr:0.0000010 epoch_Time:313.0min: [2023-12-11 
15:37:22,371][model3_sft.py][INFO] Epoch:[1/2](12800/63764) loss:2.607 lr:0.0000010 epoch_Time:313.0min: [2023-12-11 15:37:40,792][model3_sft.py][INFO] Epoch:[1/2](12850/63764) loss:2.796 lr:0.0000010 epoch_Time:313.0min: [2023-12-11 15:37:59,202][model3_sft.py][INFO] Epoch:[1/2](12900/63764) loss:3.233 lr:0.0000010 epoch_Time:312.0min: [2023-12-11 15:38:17,609][model3_sft.py][INFO] Epoch:[1/2](12950/63764) loss:3.030 lr:0.0000010 epoch_Time:312.0min: [2023-12-11 15:38:36,039][model3_sft.py][INFO] Epoch:[1/2](13000/63764) loss:3.148 lr:0.0000010 epoch_Time:312.0min: [2023-12-11 15:38:54,437][model3_sft.py][INFO] Epoch:[1/2](13050/63764) loss:3.362 lr:0.0000010 epoch_Time:311.0min: [2023-12-11 15:39:12,871][model3_sft.py][INFO] Epoch:[1/2](13100/63764) loss:3.342 lr:0.0000010 epoch_Time:311.0min: [2023-12-11 15:39:31,267][model3_sft.py][INFO] Epoch:[1/2](13150/63764) loss:2.652 lr:0.0000010 epoch_Time:311.0min: [2023-12-11 15:39:49,875][model3_sft.py][INFO] Epoch:[1/2](13200/63764) loss:2.935 lr:0.0000010 epoch_Time:310.0min: [2023-12-11 15:40:08,272][model3_sft.py][INFO] Epoch:[1/2](13250/63764) loss:3.037 lr:0.0000010 epoch_Time:310.0min: [2023-12-11 15:40:26,679][model3_sft.py][INFO] Epoch:[1/2](13300/63764) loss:3.855 lr:0.0000010 epoch_Time:310.0min: [2023-12-11 15:40:45,069][model3_sft.py][INFO] Epoch:[1/2](13350/63764) loss:3.050 lr:0.0000010 epoch_Time:310.0min: [2023-12-11 15:41:03,453][model3_sft.py][INFO] Epoch:[1/2](13400/63764) loss:2.895 lr:0.0000010 epoch_Time:309.0min: [2023-12-11 15:41:21,844][model3_sft.py][INFO] Epoch:[1/2](13450/63764) loss:3.434 lr:0.0000010 epoch_Time:309.0min: [2023-12-11 15:41:40,200][model3_sft.py][INFO] Epoch:[1/2](13500/63764) loss:2.869 lr:0.0000010 epoch_Time:309.0min: [2023-12-11 15:41:58,611][model3_sft.py][INFO] Epoch:[1/2](13550/63764) loss:2.806 lr:0.0000010 epoch_Time:308.0min: [2023-12-11 15:42:17,055][model3_sft.py][INFO] Epoch:[1/2](13600/63764) loss:2.720 lr:0.0000010 epoch_Time:308.0min: [2023-12-11 
15:42:35,508][model3_sft.py][INFO] Epoch:[1/2](13650/63764) loss:2.943 lr:0.0000010 epoch_Time:308.0min: [2023-12-11 15:42:53,914][model3_sft.py][INFO] Epoch:[1/2](13700/63764) loss:2.781 lr:0.0000010 epoch_Time:307.0min: [2023-12-11 15:43:12,288][model3_sft.py][INFO] Epoch:[1/2](13750/63764) loss:2.922 lr:0.0000010 epoch_Time:307.0min: [2023-12-11 15:43:30,709][model3_sft.py][INFO] Epoch:[1/2](13800/63764) loss:2.911 lr:0.0000010 epoch_Time:307.0min: [2023-12-11 15:43:49,139][model3_sft.py][INFO] Epoch:[1/2](13850/63764) loss:3.487 lr:0.0000010 epoch_Time:306.0min: [2023-12-11 15:44:07,529][model3_sft.py][INFO] Epoch:[1/2](13900/63764) loss:2.709 lr:0.0000010 epoch_Time:306.0min: [2023-12-11 15:44:25,969][model3_sft.py][INFO] Epoch:[1/2](13950/63764) loss:3.065 lr:0.0000010 epoch_Time:306.0min: [2023-12-11 15:44:44,390][model3_sft.py][INFO] Epoch:[1/2](14000/63764) loss:3.024 lr:0.0000010 epoch_Time:306.0min: [2023-12-11 15:45:02,843][model3_sft.py][INFO] Epoch:[1/2](14050/63764) loss:3.042 lr:0.0000010 epoch_Time:305.0min: [2023-12-11 15:45:21,223][model3_sft.py][INFO] Epoch:[1/2](14100/63764) loss:3.755 lr:0.0000010 epoch_Time:305.0min: [2023-12-11 15:45:39,856][model3_sft.py][INFO] Epoch:[1/2](14150/63764) loss:2.864 lr:0.0000010 epoch_Time:305.0min: [2023-12-11 15:45:58,361][model3_sft.py][INFO] Epoch:[1/2](14200/63764) loss:3.405 lr:0.0000010 epoch_Time:304.0min: [2023-12-11 15:46:16,933][model3_sft.py][INFO] Epoch:[1/2](14250/63764) loss:2.582 lr:0.0000010 epoch_Time:304.0min: [2023-12-11 15:46:35,435][model3_sft.py][INFO] Epoch:[1/2](14300/63764) loss:3.298 lr:0.0000010 epoch_Time:304.0min: [2023-12-11 15:46:53,846][model3_sft.py][INFO] Epoch:[1/2](14350/63764) loss:3.136 lr:0.0000010 epoch_Time:303.0min: [2023-12-11 15:47:12,288][model3_sft.py][INFO] Epoch:[1/2](14400/63764) loss:2.994 lr:0.0000010 epoch_Time:303.0min: [2023-12-11 15:47:30,746][model3_sft.py][INFO] Epoch:[1/2](14450/63764) loss:3.097 lr:0.0000010 epoch_Time:303.0min: [2023-12-11 
15:47:49,169][model3_sft.py][INFO] Epoch:[1/2](14500/63764) loss:2.506 lr:0.0000010 epoch_Time:302.0min: [2023-12-11 15:48:07,623][model3_sft.py][INFO] Epoch:[1/2](14550/63764) loss:3.154 lr:0.0000010 epoch_Time:302.0min: [2023-12-11 15:48:26,083][model3_sft.py][INFO] Epoch:[1/2](14600/63764) loss:3.148 lr:0.0000010 epoch_Time:302.0min: [2023-12-11 15:48:44,546][model3_sft.py][INFO] Epoch:[1/2](14650/63764) loss:3.529 lr:0.0000010 epoch_Time:302.0min: [2023-12-11 15:49:02,992][model3_sft.py][INFO] Epoch:[1/2](14700/63764) loss:3.497 lr:0.0000010 epoch_Time:301.0min: [2023-12-11 15:49:21,425][model3_sft.py][INFO] Epoch:[1/2](14750/63764) loss:2.770 lr:0.0000010 epoch_Time:301.0min: [2023-12-11 15:49:39,912][model3_sft.py][INFO] Epoch:[1/2](14800/63764) loss:2.869 lr:0.0000010 epoch_Time:301.0min: [2023-12-11 15:49:58,355][model3_sft.py][INFO] Epoch:[1/2](14850/63764) loss:3.023 lr:0.0000010 epoch_Time:300.0min: [2023-12-11 15:50:16,784][model3_sft.py][INFO] Epoch:[1/2](14900/63764) loss:3.106 lr:0.0000010 epoch_Time:300.0min: [2023-12-11 15:50:35,255][model3_sft.py][INFO] Epoch:[1/2](14950/63764) loss:3.609 lr:0.0000010 epoch_Time:300.0min: [2023-12-11 15:50:53,776][model3_sft.py][INFO] Epoch:[1/2](15000/63764) loss:2.928 lr:0.0000010 epoch_Time:299.0min: [2023-12-11 15:51:12,255][model3_sft.py][INFO] Epoch:[1/2](15050/63764) loss:2.978 lr:0.0000010 epoch_Time:299.0min: [2023-12-11 15:51:30,886][model3_sft.py][INFO] Epoch:[1/2](15100/63764) loss:2.919 lr:0.0000010 epoch_Time:299.0min: [2023-12-11 15:51:49,360][model3_sft.py][INFO] Epoch:[1/2](15150/63764) loss:2.538 lr:0.0000010 epoch_Time:298.0min: [2023-12-11 15:52:07,831][model3_sft.py][INFO] Epoch:[1/2](15200/63764) loss:2.580 lr:0.0000010 epoch_Time:298.0min: [2023-12-11 15:52:26,300][model3_sft.py][INFO] Epoch:[1/2](15250/63764) loss:2.964 lr:0.0000010 epoch_Time:298.0min: [2023-12-11 15:52:44,749][model3_sft.py][INFO] Epoch:[1/2](15300/63764) loss:2.426 lr:0.0000010 epoch_Time:298.0min: [2023-12-11 
15:53:03,206][model3_sft.py][INFO] Epoch:[1/2](15350/63764) loss:3.576 lr:0.0000010 epoch_Time:297.0min: [2023-12-11 15:53:21,665][model3_sft.py][INFO] Epoch:[1/2](15400/63764) loss:1.860 lr:0.0000010 epoch_Time:297.0min: [2023-12-11 15:53:40,155][model3_sft.py][INFO] Epoch:[1/2](15450/63764) loss:3.066 lr:0.0000010 epoch_Time:297.0min: [2023-12-11 15:53:58,595][model3_sft.py][INFO] Epoch:[1/2](15500/63764) loss:2.515 lr:0.0000010 epoch_Time:296.0min: [2023-12-11 15:54:17,101][model3_sft.py][INFO] Epoch:[1/2](15550/63764) loss:2.558 lr:0.0000010 epoch_Time:296.0min: [2023-12-11 15:54:35,624][model3_sft.py][INFO] Epoch:[1/2](15600/63764) loss:2.882 lr:0.0000010 epoch_Time:296.0min: [2023-12-11 15:54:54,109][model3_sft.py][INFO] Epoch:[1/2](15650/63764) loss:3.342 lr:0.0000010 epoch_Time:295.0min: [2023-12-11 15:55:12,562][model3_sft.py][INFO] Epoch:[1/2](15700/63764) loss:3.580 lr:0.0000010 epoch_Time:295.0min: [2023-12-11 15:55:31,033][model3_sft.py][INFO] Epoch:[1/2](15750/63764) loss:3.991 lr:0.0000010 epoch_Time:295.0min: [2023-12-11 15:55:49,567][model3_sft.py][INFO] Epoch:[1/2](15800/63764) loss:2.636 lr:0.0000010 epoch_Time:294.0min: [2023-12-11 15:56:08,069][model3_sft.py][INFO] Epoch:[1/2](15850/63764) loss:2.923 lr:0.0000010 epoch_Time:294.0min: [2023-12-11 15:56:26,509][model3_sft.py][INFO] Epoch:[1/2](15900/63764) loss:2.828 lr:0.0000010 epoch_Time:294.0min: [2023-12-11 15:56:44,946][model3_sft.py][INFO] Epoch:[1/2](15950/63764) loss:2.719 lr:0.0000010 epoch_Time:294.0min: [2023-12-11 15:57:03,392][model3_sft.py][INFO] Epoch:[1/2](16000/63764) loss:2.869 lr:0.0000010 epoch_Time:293.0min: [2023-12-11 15:57:22,059][model3_sft.py][INFO] Epoch:[1/2](16050/63764) loss:3.303 lr:0.0000010 epoch_Time:293.0min: [2023-12-11 15:57:40,470][model3_sft.py][INFO] Epoch:[1/2](16100/63764) loss:2.212 lr:0.0000010 epoch_Time:293.0min: [2023-12-11 15:57:58,920][model3_sft.py][INFO] Epoch:[1/2](16150/63764) loss:3.111 lr:0.0000010 epoch_Time:292.0min: [2023-12-11 
15:58:17,393][model3_sft.py][INFO] Epoch:[1/2](16200/63764) loss:2.849 lr:0.0000010 epoch_Time:292.0min: [2023-12-11 15:58:35,875][model3_sft.py][INFO] Epoch:[1/2](16250/63764) loss:3.570 lr:0.0000010 epoch_Time:292.0min: [2023-12-11 15:58:54,338][model3_sft.py][INFO] Epoch:[1/2](16300/63764) loss:2.996 lr:0.0000010 epoch_Time:291.0min: [2023-12-11 15:59:12,854][model3_sft.py][INFO] Epoch:[1/2](16350/63764) loss:2.759 lr:0.0000010 epoch_Time:291.0min: [2023-12-11 15:59:31,288][model3_sft.py][INFO] Epoch:[1/2](16400/63764) loss:2.592 lr:0.0000010 epoch_Time:291.0min: [2023-12-11 15:59:49,731][model3_sft.py][INFO] Epoch:[1/2](16450/63764) loss:3.242 lr:0.0000010 epoch_Time:290.0min: [2023-12-11 16:00:08,148][model3_sft.py][INFO] Epoch:[1/2](16500/63764) loss:3.233 lr:0.0000010 epoch_Time:290.0min: [2023-12-11 16:00:26,624][model3_sft.py][INFO] Epoch:[1/2](16550/63764) loss:2.600 lr:0.0000010 epoch_Time:290.0min: [2023-12-11 16:00:45,095][model3_sft.py][INFO] Epoch:[1/2](16600/63764) loss:3.248 lr:0.0000010 epoch_Time:290.0min: [2023-12-11 16:01:03,525][model3_sft.py][INFO] Epoch:[1/2](16650/63764) loss:3.280 lr:0.0000010 epoch_Time:289.0min: [2023-12-11 16:01:21,966][model3_sft.py][INFO] Epoch:[1/2](16700/63764) loss:2.560 lr:0.0000010 epoch_Time:289.0min: [2023-12-11 16:01:40,474][model3_sft.py][INFO] Epoch:[1/2](16750/63764) loss:3.063 lr:0.0000010 epoch_Time:289.0min: [2023-12-11 16:01:58,936][model3_sft.py][INFO] Epoch:[1/2](16800/63764) loss:3.209 lr:0.0000010 epoch_Time:288.0min: [2023-12-11 16:02:17,374][model3_sft.py][INFO] Epoch:[1/2](16850/63764) loss:3.326 lr:0.0000010 epoch_Time:288.0min: [2023-12-11 16:02:35,861][model3_sft.py][INFO] Epoch:[1/2](16900/63764) loss:3.192 lr:0.0000010 epoch_Time:288.0min: [2023-12-11 16:02:54,346][model3_sft.py][INFO] Epoch:[1/2](16950/63764) loss:2.353 lr:0.0000010 epoch_Time:287.0min: [2023-12-11 16:03:13,004][model3_sft.py][INFO] Epoch:[1/2](17000/63764) loss:3.809 lr:0.0000010 epoch_Time:287.0min: [2023-12-11 
16:03:31,427][model3_sft.py][INFO] Epoch:[1/2](17050/63764) loss:2.894 lr:0.0000010 epoch_Time:287.0min: [2023-12-11 16:03:49,884][model3_sft.py][INFO] Epoch:[1/2](17100/63764) loss:3.176 lr:0.0000010 epoch_Time:286.0min: [2023-12-11 16:04:08,338][model3_sft.py][INFO] Epoch:[1/2](17150/63764) loss:3.297 lr:0.0000010 epoch_Time:286.0min: [2023-12-11 16:04:26,764][model3_sft.py][INFO] Epoch:[1/2](17200/63764) loss:2.678 lr:0.0000010 epoch_Time:286.0min: [2023-12-11 16:04:45,235][model3_sft.py][INFO] Epoch:[1/2](17250/63764) loss:2.949 lr:0.0000010 epoch_Time:286.0min: [2023-12-11 16:05:03,701][model3_sft.py][INFO] Epoch:[1/2](17300/63764) loss:3.247 lr:0.0000010 epoch_Time:285.0min: [2023-12-11 16:05:22,141][model3_sft.py][INFO] Epoch:[1/2](17350/63764) loss:2.997 lr:0.0000010 epoch_Time:285.0min: [2023-12-11 16:05:40,612][model3_sft.py][INFO] Epoch:[1/2](17400/63764) loss:2.342 lr:0.0000010 epoch_Time:285.0min: [2023-12-11 16:05:59,051][model3_sft.py][INFO] Epoch:[1/2](17450/63764) loss:3.522 lr:0.0000010 epoch_Time:284.0min: [2023-12-11 16:06:17,479][model3_sft.py][INFO] Epoch:[1/2](17500/63764) loss:3.222 lr:0.0000010 epoch_Time:284.0min: [2023-12-11 16:06:35,975][model3_sft.py][INFO] Epoch:[1/2](17550/63764) loss:2.614 lr:0.0000010 epoch_Time:284.0min: [2023-12-11 16:06:54,430][model3_sft.py][INFO] Epoch:[1/2](17600/63764) loss:2.662 lr:0.0000010 epoch_Time:283.0min: [2023-12-11 16:07:12,893][model3_sft.py][INFO] Epoch:[1/2](17650/63764) loss:3.330 lr:0.0000010 epoch_Time:283.0min: [2023-12-11 16:07:31,368][model3_sft.py][INFO] Epoch:[1/2](17700/63764) loss:3.652 lr:0.0000010 epoch_Time:283.0min: [2023-12-11 16:07:49,835][model3_sft.py][INFO] Epoch:[1/2](17750/63764) loss:2.969 lr:0.0000010 epoch_Time:282.0min: [2023-12-11 16:08:08,254][model3_sft.py][INFO] Epoch:[1/2](17800/63764) loss:2.922 lr:0.0000010 epoch_Time:282.0min: [2023-12-11 16:08:26,733][model3_sft.py][INFO] Epoch:[1/2](17850/63764) loss:3.505 lr:0.0000010 epoch_Time:282.0min: [2023-12-11 
16:08:45,155][model3_sft.py][INFO] Epoch:[1/2](17900/63764) loss:3.127 lr:0.0000010 epoch_Time:282.0min: [2023-12-11 16:09:03,588][model3_sft.py][INFO] Epoch:[1/2](17950/63764) loss:2.426 lr:0.0000010 epoch_Time:281.0min: [2023-12-11 16:09:22,201][model3_sft.py][INFO] Epoch:[1/2](18000/63764) loss:3.227 lr:0.0000010 epoch_Time:281.0min: [2023-12-11 16:09:40,669][model3_sft.py][INFO] Epoch:[1/2](18050/63764) loss:2.880 lr:0.0000010 epoch_Time:281.0min: [2023-12-11 16:09:59,114][model3_sft.py][INFO] Epoch:[1/2](18100/63764) loss:2.589 lr:0.0000010 epoch_Time:280.0min: [2023-12-11 16:10:17,604][model3_sft.py][INFO] Epoch:[1/2](18150/63764) loss:2.739 lr:0.0000010 epoch_Time:280.0min: [2023-12-11 16:10:36,062][model3_sft.py][INFO] Epoch:[1/2](18200/63764) loss:2.967 lr:0.0000010 epoch_Time:280.0min: [2023-12-11 16:10:54,499][model3_sft.py][INFO] Epoch:[1/2](18250/63764) loss:3.179 lr:0.0000010 epoch_Time:279.0min: [2023-12-11 16:11:12,978][model3_sft.py][INFO] Epoch:[1/2](18300/63764) loss:3.507 lr:0.0000010 epoch_Time:279.0min: [2023-12-11 16:11:31,421][model3_sft.py][INFO] Epoch:[1/2](18350/63764) loss:2.444 lr:0.0000010 epoch_Time:279.0min: [2023-12-11 16:11:49,930][model3_sft.py][INFO] Epoch:[1/2](18400/63764) loss:3.079 lr:0.0000010 epoch_Time:278.0min: [2023-12-11 16:12:08,369][model3_sft.py][INFO] Epoch:[1/2](18450/63764) loss:2.994 lr:0.0000010 epoch_Time:278.0min: [2023-12-11 16:12:26,795][model3_sft.py][INFO] Epoch:[1/2](18500/63764) loss:4.038 lr:0.0000010 epoch_Time:278.0min: [2023-12-11 16:12:45,268][model3_sft.py][INFO] Epoch:[1/2](18550/63764) loss:3.557 lr:0.0000010 epoch_Time:278.0min: [2023-12-11 16:13:03,742][model3_sft.py][INFO] Epoch:[1/2](18600/63764) loss:2.785 lr:0.0000010 epoch_Time:277.0min: [2023-12-11 16:13:22,176][model3_sft.py][INFO] Epoch:[1/2](18650/63764) loss:3.281 lr:0.0000010 epoch_Time:277.0min: [2023-12-11 16:13:40,656][model3_sft.py][INFO] Epoch:[1/2](18700/63764) loss:3.012 lr:0.0000010 epoch_Time:277.0min: [2023-12-11 
16:13:59,139][model3_sft.py][INFO] Epoch:[1/2](18750/63764) loss:3.042 lr:0.0000010 epoch_Time:276.0min: [2023-12-11 16:14:17,598][model3_sft.py][INFO] Epoch:[1/2](18800/63764) loss:2.812 lr:0.0000010 epoch_Time:276.0min: [2023-12-11 16:14:36,062][model3_sft.py][INFO] Epoch:[1/2](18850/63764) loss:2.730 lr:0.0000010 epoch_Time:276.0min: [2023-12-11 16:14:54,539][model3_sft.py][INFO] Epoch:[1/2](18900/63764) loss:2.525 lr:0.0000010 epoch_Time:275.0min: [2023-12-11 16:15:13,214][model3_sft.py][INFO] Epoch:[1/2](18950/63764) loss:3.591 lr:0.0000010 epoch_Time:275.0min: [2023-12-11 16:15:31,650][model3_sft.py][INFO] Epoch:[1/2](19000/63764) loss:2.346 lr:0.0000010 epoch_Time:275.0min: [2023-12-11 16:15:50,055][model3_sft.py][INFO] Epoch:[1/2](19050/63764) loss:3.397 lr:0.0000010 epoch_Time:274.0min: [2023-12-11 16:16:08,539][model3_sft.py][INFO] Epoch:[1/2](19100/63764) loss:2.821 lr:0.0000010 epoch_Time:274.0min: [2023-12-11 16:16:27,020][model3_sft.py][INFO] Epoch:[1/2](19150/63764) loss:2.938 lr:0.0000010 epoch_Time:274.0min: [2023-12-11 16:16:45,476][model3_sft.py][INFO] Epoch:[1/2](19200/63764) loss:2.759 lr:0.0000010 epoch_Time:274.0min: [2023-12-11 16:17:03,943][model3_sft.py][INFO] Epoch:[1/2](19250/63764) loss:3.169 lr:0.0000010 epoch_Time:273.0min: [2023-12-11 16:17:22,379][model3_sft.py][INFO] Epoch:[1/2](19300/63764) loss:3.648 lr:0.0000010 epoch_Time:273.0min: [2023-12-11 16:17:40,832][model3_sft.py][INFO] Epoch:[1/2](19350/63764) loss:3.283 lr:0.0000010 epoch_Time:273.0min: [2023-12-11 16:17:59,285][model3_sft.py][INFO] Epoch:[1/2](19400/63764) loss:3.195 lr:0.0000010 epoch_Time:272.0min: [2023-12-11 16:18:17,730][model3_sft.py][INFO] Epoch:[1/2](19450/63764) loss:3.321 lr:0.0000010 epoch_Time:272.0min: [2023-12-11 16:18:36,201][model3_sft.py][INFO] Epoch:[1/2](19500/63764) loss:3.342 lr:0.0000010 epoch_Time:272.0min: [2023-12-11 16:18:54,673][model3_sft.py][INFO] Epoch:[1/2](19550/63764) loss:3.074 lr:0.0000010 epoch_Time:271.0min: [2023-12-11 
16:19:13,115][model3_sft.py][INFO] Epoch:[1/2](19600/63764) loss:2.882 lr:0.0000010 epoch_Time:271.0min: [2023-12-11 16:19:31,570][model3_sft.py][INFO] Epoch:[1/2](19650/63764) loss:2.510 lr:0.0000010 epoch_Time:271.0min: [2023-12-11 16:19:50,024][model3_sft.py][INFO] Epoch:[1/2](19700/63764) loss:3.071 lr:0.0000010 epoch_Time:270.0min: [2023-12-11 16:20:08,479][model3_sft.py][INFO] Epoch:[1/2](19750/63764) loss:3.158 lr:0.0000010 epoch_Time:270.0min: [2023-12-11 16:20:26,967][model3_sft.py][INFO] Epoch:[1/2](19800/63764) loss:3.832 lr:0.0000010 epoch_Time:270.0min: [2023-12-11 16:20:45,412][model3_sft.py][INFO] Epoch:[1/2](19850/63764) loss:2.991 lr:0.0000010 epoch_Time:270.0min: [2023-12-11 16:21:04,106][model3_sft.py][INFO] Epoch:[1/2](19900/63764) loss:2.546 lr:0.0000010 epoch_Time:269.0min: [2023-12-11 16:21:22,545][model3_sft.py][INFO] Epoch:[1/2](19950/63764) loss:3.005 lr:0.0000010 epoch_Time:269.0min: [2023-12-11 16:21:41,008][model3_sft.py][INFO] Epoch:[1/2](20000/63764) loss:2.840 lr:0.0000010 epoch_Time:269.0min: [2023-12-11 16:21:59,499][model3_sft.py][INFO] Epoch:[1/2](20050/63764) loss:2.766 lr:0.0000010 epoch_Time:268.0min: [2023-12-11 16:22:17,978][model3_sft.py][INFO] Epoch:[1/2](20100/63764) loss:2.801 lr:0.0000010 epoch_Time:268.0min: [2023-12-11 16:22:36,425][model3_sft.py][INFO] Epoch:[1/2](20150/63764) loss:2.982 lr:0.0000010 epoch_Time:268.0min: [2023-12-11 16:22:54,889][model3_sft.py][INFO] Epoch:[1/2](20200/63764) loss:3.063 lr:0.0000010 epoch_Time:267.0min: [2023-12-11 16:23:13,366][model3_sft.py][INFO] Epoch:[1/2](20250/63764) loss:3.390 lr:0.0000010 epoch_Time:267.0min: [2023-12-11 16:23:31,846][model3_sft.py][INFO] Epoch:[1/2](20300/63764) loss:2.655 lr:0.0000010 epoch_Time:267.0min: [2023-12-11 16:23:50,314][model3_sft.py][INFO] Epoch:[1/2](20350/63764) loss:2.945 lr:0.0000010 epoch_Time:266.0min: [2023-12-11 16:24:08,781][model3_sft.py][INFO] Epoch:[1/2](20400/63764) loss:3.288 lr:0.0000010 epoch_Time:266.0min: [2023-12-11 
16:24:27,221][model3_sft.py][INFO] Epoch:[1/2](20450/63764) loss:3.038 lr:0.0000010 epoch_Time:266.0min: [2023-12-11 16:24:45,718][model3_sft.py][INFO] Epoch:[1/2](20500/63764) loss:3.061 lr:0.0000010 epoch_Time:266.0min: [2023-12-11 16:25:04,167][model3_sft.py][INFO] Epoch:[1/2](20550/63764) loss:3.810 lr:0.0000010 epoch_Time:265.0min: [2023-12-11 16:25:22,661][model3_sft.py][INFO] Epoch:[1/2](20600/63764) loss:2.277 lr:0.0000010 epoch_Time:265.0min: [2023-12-11 16:25:41,168][model3_sft.py][INFO] Epoch:[1/2](20650/63764) loss:2.933 lr:0.0000010 epoch_Time:265.0min: [2023-12-11 16:25:59,654][model3_sft.py][INFO] Epoch:[1/2](20700/63764) loss:2.812 lr:0.0000010 epoch_Time:264.0min: [2023-12-11 16:26:18,140][model3_sft.py][INFO] Epoch:[1/2](20750/63764) loss:3.506 lr:0.0000010 epoch_Time:264.0min: [2023-12-11 16:26:36,624][model3_sft.py][INFO] Epoch:[1/2](20800/63764) loss:2.632 lr:0.0000010 epoch_Time:264.0min: [2023-12-11 16:26:55,383][model3_sft.py][INFO] Epoch:[1/2](20850/63764) loss:2.405 lr:0.0000010 epoch_Time:263.0min: [2023-12-11 16:27:13,829][model3_sft.py][INFO] Epoch:[1/2](20900/63764) loss:3.510 lr:0.0000010 epoch_Time:263.0min: [2023-12-11 16:27:32,328][model3_sft.py][INFO] Epoch:[1/2](20950/63764) loss:3.296 lr:0.0000010 epoch_Time:263.0min: [2023-12-11 16:27:50,813][model3_sft.py][INFO] Epoch:[1/2](21000/63764) loss:2.699 lr:0.0000010 epoch_Time:262.0min: [2023-12-11 16:28:09,298][model3_sft.py][INFO] Epoch:[1/2](21050/63764) loss:3.545 lr:0.0000010 epoch_Time:262.0min: [2023-12-11 16:28:27,748][model3_sft.py][INFO] Epoch:[1/2](21100/63764) loss:3.566 lr:0.0000010 epoch_Time:262.0min: [2023-12-11 16:28:46,221][model3_sft.py][INFO] Epoch:[1/2](21150/63764) loss:2.458 lr:0.0000010 epoch_Time:262.0min: [2023-12-11 16:29:04,677][model3_sft.py][INFO] Epoch:[1/2](21200/63764) loss:3.124 lr:0.0000010 epoch_Time:261.0min: [2023-12-11 16:29:23,150][model3_sft.py][INFO] Epoch:[1/2](21250/63764) loss:3.258 lr:0.0000010 epoch_Time:261.0min: [2023-12-11 
16:29:41,684][model3_sft.py][INFO] Epoch:[1/2](21300/63764) loss:3.129 lr:0.0000010 epoch_Time:261.0min: [2023-12-11 16:30:00,142][model3_sft.py][INFO] Epoch:[1/2](21350/63764) loss:2.773 lr:0.0000010 epoch_Time:260.0min: [2023-12-11 16:30:18,616][model3_sft.py][INFO] Epoch:[1/2](21400/63764) loss:2.429 lr:0.0000010 epoch_Time:260.0min: [2023-12-11 16:30:37,162][model3_sft.py][INFO] Epoch:[1/2](21450/63764) loss:2.671 lr:0.0000010 epoch_Time:260.0min: [2023-12-11 16:30:55,621][model3_sft.py][INFO] Epoch:[1/2](21500/63764) loss:3.081 lr:0.0000010 epoch_Time:259.0min: [2023-12-11 16:31:14,092][model3_sft.py][INFO] Epoch:[1/2](21550/63764) loss:2.535 lr:0.0000010 epoch_Time:259.0min: [2023-12-11 16:31:32,560][model3_sft.py][INFO] Epoch:[1/2](21600/63764) loss:3.451 lr:0.0000010 epoch_Time:259.0min: [2023-12-11 16:31:51,057][model3_sft.py][INFO] Epoch:[1/2](21650/63764) loss:3.085 lr:0.0000010 epoch_Time:258.0min: [2023-12-11 16:32:09,519][model3_sft.py][INFO] Epoch:[1/2](21700/63764) loss:2.419 lr:0.0000010 epoch_Time:258.0min: [2023-12-11 16:32:27,944][model3_sft.py][INFO] Epoch:[1/2](21750/63764) loss:3.430 lr:0.0000010 epoch_Time:258.0min: [2023-12-11 16:32:46,659][model3_sft.py][INFO] Epoch:[1/2](21800/63764) loss:3.750 lr:0.0000010 epoch_Time:258.0min: [2023-12-11 16:33:05,115][model3_sft.py][INFO] Epoch:[1/2](21850/63764) loss:2.729 lr:0.0000010 epoch_Time:257.0min: [2023-12-11 16:33:23,587][model3_sft.py][INFO] Epoch:[1/2](21900/63764) loss:3.335 lr:0.0000010 epoch_Time:257.0min: [2023-12-11 16:33:42,039][model3_sft.py][INFO] Epoch:[1/2](21950/63764) loss:2.812 lr:0.0000010 epoch_Time:257.0min: [2023-12-11 16:34:00,514][model3_sft.py][INFO] Epoch:[1/2](22000/63764) loss:3.289 lr:0.0000010 epoch_Time:256.0min: [2023-12-11 16:34:18,945][model3_sft.py][INFO] Epoch:[1/2](22050/63764) loss:3.160 lr:0.0000010 epoch_Time:256.0min: [2023-12-11 16:34:37,477][model3_sft.py][INFO] Epoch:[1/2](22100/63764) loss:3.386 lr:0.0000010 epoch_Time:256.0min: [2023-12-11 
16:34:55,945][model3_sft.py][INFO] Epoch:[1/2](22150/63764) loss:2.939 lr:0.0000010 epoch_Time:255.0min: [2023-12-11 16:35:14,397][model3_sft.py][INFO] Epoch:[1/2](22200/63764) loss:2.999 lr:0.0000010 epoch_Time:255.0min: [2023-12-11 16:35:32,859][model3_sft.py][INFO] Epoch:[1/2](22250/63764) loss:2.155 lr:0.0000010 epoch_Time:255.0min: [2023-12-11 16:35:51,328][model3_sft.py][INFO] Epoch:[1/2](22300/63764) loss:2.715 lr:0.0000010 epoch_Time:254.0min: [2023-12-11 16:36:09,771][model3_sft.py][INFO] Epoch:[1/2](22350/63764) loss:2.649 lr:0.0000010 epoch_Time:254.0min: [2023-12-11 16:36:28,223][model3_sft.py][INFO] Epoch:[1/2](22400/63764) loss:2.588 lr:0.0000010 epoch_Time:254.0min: [2023-12-11 16:36:46,733][model3_sft.py][INFO] Epoch:[1/2](22450/63764) loss:2.915 lr:0.0000010 epoch_Time:254.0min: [2023-12-11 16:37:05,197][model3_sft.py][INFO] Epoch:[1/2](22500/63764) loss:3.325 lr:0.0000010 epoch_Time:253.0min: [2023-12-11 16:37:23,722][model3_sft.py][INFO] Epoch:[1/2](22550/63764) loss:2.616 lr:0.0000010 epoch_Time:253.0min: [2023-12-11 16:37:42,230][model3_sft.py][INFO] Epoch:[1/2](22600/63764) loss:3.144 lr:0.0000010 epoch_Time:253.0min: [2023-12-11 16:38:00,703][model3_sft.py][INFO] Epoch:[1/2](22650/63764) loss:2.981 lr:0.0000010 epoch_Time:252.0min: [2023-12-11 16:38:19,168][model3_sft.py][INFO] Epoch:[1/2](22700/63764) loss:2.686 lr:0.0000010 epoch_Time:252.0min: [2023-12-11 16:38:37,677][model3_sft.py][INFO] Epoch:[1/2](22750/63764) loss:2.768 lr:0.0000010 epoch_Time:252.0min: [2023-12-11 16:38:56,374][model3_sft.py][INFO] Epoch:[1/2](22800/63764) loss:3.241 lr:0.0000010 epoch_Time:251.0min: [2023-12-11 16:39:14,848][model3_sft.py][INFO] Epoch:[1/2](22850/63764) loss:2.724 lr:0.0000010 epoch_Time:251.0min: [2023-12-11 16:39:33,289][model3_sft.py][INFO] Epoch:[1/2](22900/63764) loss:2.521 lr:0.0000010 epoch_Time:251.0min: [2023-12-11 16:39:51,780][model3_sft.py][INFO] Epoch:[1/2](22950/63764) loss:3.421 lr:0.0000010 epoch_Time:250.0min: [2023-12-11 
16:40:10,239][model3_sft.py][INFO] Epoch:[1/2](23000/63764) loss:2.434 lr:0.0000010 epoch_Time:250.0min: [2023-12-11 16:40:28,683][model3_sft.py][INFO] Epoch:[1/2](23050/63764) loss:3.485 lr:0.0000010 epoch_Time:250.0min: [2023-12-11 16:40:47,145][model3_sft.py][INFO] Epoch:[1/2](23100/63764) loss:2.574 lr:0.0000010 epoch_Time:250.0min: [2023-12-11 16:41:05,648][model3_sft.py][INFO] Epoch:[1/2](23150/63764) loss:2.365 lr:0.0000010 epoch_Time:249.0min: [2023-12-11 16:41:24,116][model3_sft.py][INFO] Epoch:[1/2](23200/63764) loss:2.838 lr:0.0000010 epoch_Time:249.0min: [2023-12-11 16:41:42,615][model3_sft.py][INFO] Epoch:[1/2](23250/63764) loss:2.467 lr:0.0000010 epoch_Time:249.0min: [2023-12-11 16:42:01,112][model3_sft.py][INFO] Epoch:[1/2](23300/63764) loss:3.911 lr:0.0000010 epoch_Time:248.0min: [2023-12-11 16:42:19,627][model3_sft.py][INFO] Epoch:[1/2](23350/63764) loss:2.674 lr:0.0000010 epoch_Time:248.0min: [2023-12-11 16:42:38,080][model3_sft.py][INFO] Epoch:[1/2](23400/63764) loss:3.509 lr:0.0000010 epoch_Time:248.0min: [2023-12-11 16:42:56,581][model3_sft.py][INFO] Epoch:[1/2](23450/63764) loss:2.545 lr:0.0000010 epoch_Time:247.0min: [2023-12-11 16:43:15,056][model3_sft.py][INFO] Epoch:[1/2](23500/63764) loss:2.626 lr:0.0000010 epoch_Time:247.0min: [2023-12-11 16:43:33,528][model3_sft.py][INFO] Epoch:[1/2](23550/63764) loss:2.489 lr:0.0000010 epoch_Time:247.0min: [2023-12-11 16:43:52,003][model3_sft.py][INFO] Epoch:[1/2](23600/63764) loss:4.425 lr:0.0000010 epoch_Time:246.0min: [2023-12-11 16:44:10,464][model3_sft.py][INFO] Epoch:[1/2](23650/63764) loss:2.506 lr:0.0000010 epoch_Time:246.0min: [2023-12-11 16:44:28,944][model3_sft.py][INFO] Epoch:[1/2](23700/63764) loss:2.774 lr:0.0000010 epoch_Time:246.0min: [2023-12-11 16:44:47,619][model3_sft.py][INFO] Epoch:[1/2](23750/63764) loss:2.573 lr:0.0000010 epoch_Time:246.0min: [2023-12-11 16:45:06,060][model3_sft.py][INFO] Epoch:[1/2](23800/63764) loss:3.039 lr:0.0000010 epoch_Time:245.0min: [2023-12-11 
16:45:24,538][model3_sft.py][INFO] Epoch:[1/2](23850/63764) loss:2.952 lr:0.0000010 epoch_Time:245.0min: [2023-12-11 16:45:43,045][model3_sft.py][INFO] Epoch:[1/2](23900/63764) loss:2.904 lr:0.0000010 epoch_Time:245.0min: [2023-12-11 16:46:01,521][model3_sft.py][INFO] Epoch:[1/2](23950/63764) loss:2.996 lr:0.0000010 epoch_Time:244.0min: [2023-12-11 16:46:19,976][model3_sft.py][INFO] Epoch:[1/2](24000/63764) loss:2.573 lr:0.0000010 epoch_Time:244.0min: [2023-12-11 16:46:38,408][model3_sft.py][INFO] Epoch:[1/2](24050/63764) loss:2.790 lr:0.0000010 epoch_Time:244.0min: [2023-12-11 16:46:56,854][model3_sft.py][INFO] Epoch:[1/2](24100/63764) loss:2.314 lr:0.0000010 epoch_Time:243.0min: [2023-12-11 16:47:15,369][model3_sft.py][INFO] Epoch:[1/2](24150/63764) loss:2.422 lr:0.0000010 epoch_Time:243.0min: [2023-12-11 16:47:33,817][model3_sft.py][INFO] Epoch:[1/2](24200/63764) loss:3.083 lr:0.0000010 epoch_Time:243.0min: [2023-12-11 16:47:52,295][model3_sft.py][INFO] Epoch:[1/2](24250/63764) loss:2.456 lr:0.0000010 epoch_Time:242.0min: [2023-12-11 16:48:10,793][model3_sft.py][INFO] Epoch:[1/2](24300/63764) loss:2.929 lr:0.0000010 epoch_Time:242.0min: [2023-12-11 16:48:29,310][model3_sft.py][INFO] Epoch:[1/2](24350/63764) loss:2.395 lr:0.0000010 epoch_Time:242.0min: [2023-12-11 16:48:47,836][model3_sft.py][INFO] Epoch:[1/2](24400/63764) loss:3.620 lr:0.0000010 epoch_Time:241.0min: [2023-12-11 16:49:06,338][model3_sft.py][INFO] Epoch:[1/2](24450/63764) loss:3.134 lr:0.0000010 epoch_Time:241.0min: [2023-12-11 16:49:24,816][model3_sft.py][INFO] Epoch:[1/2](24500/63764) loss:3.106 lr:0.0000010 epoch_Time:241.0min: [2023-12-11 16:49:43,288][model3_sft.py][INFO] Epoch:[1/2](24550/63764) loss:3.207 lr:0.0000010 epoch_Time:241.0min: [2023-12-11 16:50:01,737][model3_sft.py][INFO] Epoch:[1/2](24600/63764) loss:2.995 lr:0.0000010 epoch_Time:240.0min: [2023-12-11 16:50:20,159][model3_sft.py][INFO] Epoch:[1/2](24650/63764) loss:2.894 lr:0.0000010 epoch_Time:240.0min: [2023-12-11 
16:50:38,807][model3_sft.py][INFO] Epoch:[1/2](24700/63764) loss:2.225 lr:0.0000010 epoch_Time:240.0min: [2023-12-11 16:50:57,244][model3_sft.py][INFO] Epoch:[1/2](24750/63764) loss:3.341 lr:0.0000010 epoch_Time:239.0min: [2023-12-11 16:51:15,695][model3_sft.py][INFO] Epoch:[1/2](24800/63764) loss:3.341 lr:0.0000010 epoch_Time:239.0min: [2023-12-11 16:51:34,179][model3_sft.py][INFO] Epoch:[1/2](24850/63764) loss:2.709 lr:0.0000010 epoch_Time:239.0min: [2023-12-11 16:51:52,667][model3_sft.py][INFO] Epoch:[1/2](24900/63764) loss:2.893 lr:0.0000010 epoch_Time:238.0min: [2023-12-11 16:52:11,151][model3_sft.py][INFO] Epoch:[1/2](24950/63764) loss:3.224 lr:0.0000010 epoch_Time:239.0min: [2023-12-11 16:52:29,606][model3_sft.py][INFO] Epoch:[1/2](25000/63764) loss:3.322 lr:0.0000010 epoch_Time:239.0min: [2023-12-11 16:52:48,116][model3_sft.py][INFO] Epoch:[1/2](25050/63764) loss:2.986 lr:0.0000010 epoch_Time:238.0min: [2023-12-11 16:53:06,605][model3_sft.py][INFO] Epoch:[1/2](25100/63764) loss:3.539 lr:0.0000010 epoch_Time:238.0min: [2023-12-11 16:53:25,104][model3_sft.py][INFO] Epoch:[1/2](25150/63764) loss:3.287 lr:0.0000010 epoch_Time:238.0min: [2023-12-11 16:53:43,559][model3_sft.py][INFO] Epoch:[1/2](25200/63764) loss:2.637 lr:0.0000010 epoch_Time:238.0min: [2023-12-11 16:54:02,049][model3_sft.py][INFO] Epoch:[1/2](25250/63764) loss:3.420 lr:0.0000010 epoch_Time:237.0min: [2023-12-11 16:54:20,493][model3_sft.py][INFO] Epoch:[1/2](25300/63764) loss:3.159 lr:0.0000010 epoch_Time:237.0min: [2023-12-11 16:54:38,971][model3_sft.py][INFO] Epoch:[1/2](25350/63764) loss:2.972 lr:0.0000010 epoch_Time:237.0min: [2023-12-11 16:54:57,428][model3_sft.py][INFO] Epoch:[1/2](25400/63764) loss:2.355 lr:0.0000010 epoch_Time:236.0min: [2023-12-11 16:55:15,916][model3_sft.py][INFO] Epoch:[1/2](25450/63764) loss:2.758 lr:0.0000010 epoch_Time:236.0min: [2023-12-11 16:55:34,394][model3_sft.py][INFO] Epoch:[1/2](25500/63764) loss:3.195 lr:0.0000010 epoch_Time:236.0min: [2023-12-11 
16:55:52,848][model3_sft.py][INFO] Epoch:[1/2](25550/63764) loss:2.606 lr:0.0000010 epoch_Time:235.0min: [2023-12-11 16:56:11,316][model3_sft.py][INFO] Epoch:[1/2](25600/63764) loss:2.825 lr:0.0000010 epoch_Time:235.0min: [2023-12-11 16:56:30,022][model3_sft.py][INFO] Epoch:[1/2](25650/63764) loss:3.118 lr:0.0000010 epoch_Time:235.0min: [2023-12-11 16:56:48,479][model3_sft.py][INFO] Epoch:[1/2](25700/63764) loss:2.226 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 16:57:06,956][model3_sft.py][INFO] Epoch:[1/2](25750/63764) loss:2.940 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 16:57:25,410][model3_sft.py][INFO] Epoch:[1/2](25800/63764) loss:2.972 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 16:57:43,920][model3_sft.py][INFO] Epoch:[1/2](25850/63764) loss:2.895 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 16:58:02,385][model3_sft.py][INFO] Epoch:[1/2](25900/63764) loss:3.191 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 16:58:20,823][model3_sft.py][INFO] Epoch:[1/2](25950/63764) loss:3.319 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 16:58:39,288][model3_sft.py][INFO] Epoch:[1/2](26000/63764) loss:2.891 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 16:58:57,808][model3_sft.py][INFO] Epoch:[1/2](26050/63764) loss:2.450 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 16:59:16,292][model3_sft.py][INFO] Epoch:[1/2](26100/63764) loss:3.049 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 16:59:34,777][model3_sft.py][INFO] Epoch:[1/2](26150/63764) loss:2.626 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 16:59:53,285][model3_sft.py][INFO] Epoch:[1/2](26200/63764) loss:2.781 lr:0.0000010 epoch_Time:231.0min: [2023-12-11 17:00:11,768][model3_sft.py][INFO] Epoch:[1/2](26250/63764) loss:2.771 lr:0.0000010 epoch_Time:231.0min: [2023-12-11 17:00:30,199][model3_sft.py][INFO] Epoch:[1/2](26300/63764) loss:4.164 lr:0.0000010 epoch_Time:231.0min: [2023-12-11 17:00:48,613][model3_sft.py][INFO] Epoch:[1/2](26350/63764) loss:3.035 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 
17:01:07,081][model3_sft.py][INFO] Epoch:[1/2](26400/63764) loss:2.627 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 17:01:25,559][model3_sft.py][INFO] Epoch:[1/2](26450/63764) loss:2.773 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 17:01:44,043][model3_sft.py][INFO] Epoch:[1/2](26500/63764) loss:3.021 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 17:02:02,476][model3_sft.py][INFO] Epoch:[1/2](26550/63764) loss:3.135 lr:0.0000010 epoch_Time:229.0min: [2023-12-11 17:02:21,252][model3_sft.py][INFO] Epoch:[1/2](26600/63764) loss:2.455 lr:0.0000010 epoch_Time:229.0min: [2023-12-11 17:02:39,735][model3_sft.py][INFO] Epoch:[1/2](26650/63764) loss:2.929 lr:0.0000010 epoch_Time:229.0min: [2023-12-11 17:02:58,206][model3_sft.py][INFO] Epoch:[1/2](26700/63764) loss:2.328 lr:0.0000010 epoch_Time:228.0min: [2023-12-11 17:03:16,716][model3_sft.py][INFO] Epoch:[1/2](26750/63764) loss:3.471 lr:0.0000010 epoch_Time:228.0min: [2023-12-11 17:03:35,227][model3_sft.py][INFO] Epoch:[1/2](26800/63764) loss:2.907 lr:0.0000010 epoch_Time:228.0min: [2023-12-11 17:03:53,671][model3_sft.py][INFO] Epoch:[1/2](26850/63764) loss:2.776 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 17:04:12,162][model3_sft.py][INFO] Epoch:[1/2](26900/63764) loss:2.324 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 17:04:30,626][model3_sft.py][INFO] Epoch:[1/2](26950/63764) loss:2.372 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 17:04:49,102][model3_sft.py][INFO] Epoch:[1/2](27000/63764) loss:3.869 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 17:05:07,532][model3_sft.py][INFO] Epoch:[1/2](27050/63764) loss:2.514 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 17:05:26,014][model3_sft.py][INFO] Epoch:[1/2](27100/63764) loss:2.811 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 17:05:44,500][model3_sft.py][INFO] Epoch:[1/2](27150/63764) loss:2.602 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 17:06:02,984][model3_sft.py][INFO] Epoch:[1/2](27200/63764) loss:3.624 lr:0.0000010 epoch_Time:225.0min: [2023-12-11 
17:06:21,434][model3_sft.py][INFO] Epoch:[1/2](27250/63764) loss:2.555 lr:0.0000010 epoch_Time:225.0min: [2023-12-11 17:06:39,889][model3_sft.py][INFO] Epoch:[1/2](27300/63764) loss:2.930 lr:0.0000010 epoch_Time:225.0min: [2023-12-11 17:06:58,419][model3_sft.py][INFO] Epoch:[1/2](27350/63764) loss:2.632 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 17:07:16,882][model3_sft.py][INFO] Epoch:[1/2](27400/63764) loss:2.450 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 17:07:35,359][model3_sft.py][INFO] Epoch:[1/2](27450/63764) loss:3.246 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 17:07:53,808][model3_sft.py][INFO] Epoch:[1/2](27500/63764) loss:2.965 lr:0.0000010 epoch_Time:223.0min: [2023-12-11 17:08:12,313][model3_sft.py][INFO] Epoch:[1/2](27550/63764) loss:2.924 lr:0.0000010 epoch_Time:223.0min: [2023-12-11 17:08:30,998][model3_sft.py][INFO] Epoch:[1/2](27600/63764) loss:2.802 lr:0.0000010 epoch_Time:223.0min: [2023-12-11 17:08:49,452][model3_sft.py][INFO] Epoch:[1/2](27650/63764) loss:3.613 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 17:09:07,910][model3_sft.py][INFO] Epoch:[1/2](27700/63764) loss:3.401 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 17:09:26,389][model3_sft.py][INFO] Epoch:[1/2](27750/63764) loss:2.970 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 17:09:44,893][model3_sft.py][INFO] Epoch:[1/2](27800/63764) loss:2.941 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 17:10:03,392][model3_sft.py][INFO] Epoch:[1/2](27850/63764) loss:3.098 lr:0.0000010 epoch_Time:221.0min: [2023-12-11 17:10:21,839][model3_sft.py][INFO] Epoch:[1/2](27900/63764) loss:3.499 lr:0.0000010 epoch_Time:221.0min: [2023-12-11 17:10:40,279][model3_sft.py][INFO] Epoch:[1/2](27950/63764) loss:3.216 lr:0.0000010 epoch_Time:221.0min: [2023-12-11 17:10:58,728][model3_sft.py][INFO] Epoch:[1/2](28000/63764) loss:2.479 lr:0.0000010 epoch_Time:220.0min: [2023-12-11 17:11:17,196][model3_sft.py][INFO] Epoch:[1/2](28050/63764) loss:3.279 lr:0.0000010 epoch_Time:220.0min: [2023-12-11 
17:11:35,662][model3_sft.py][INFO] Epoch:[1/2](28100/63764) loss:2.678 lr:0.0000010 epoch_Time:220.0min: [2023-12-11 17:11:54,158][model3_sft.py][INFO] Epoch:[1/2](28150/63764) loss:3.460 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 17:12:12,600][model3_sft.py][INFO] Epoch:[1/2](28200/63764) loss:2.621 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 17:12:31,121][model3_sft.py][INFO] Epoch:[1/2](28250/63764) loss:2.925 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 17:12:49,629][model3_sft.py][INFO] Epoch:[1/2](28300/63764) loss:3.133 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 17:13:08,226][model3_sft.py][INFO] Epoch:[1/2](28350/63764) loss:2.778 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 17:13:26,744][model3_sft.py][INFO] Epoch:[1/2](28400/63764) loss:3.089 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 17:13:45,282][model3_sft.py][INFO] Epoch:[1/2](28450/63764) loss:2.690 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 17:14:03,783][model3_sft.py][INFO] Epoch:[1/2](28500/63764) loss:3.055 lr:0.0000010 epoch_Time:217.0min: [2023-12-11 17:14:22,527][model3_sft.py][INFO] Epoch:[1/2](28550/63764) loss:3.115 lr:0.0000010 epoch_Time:217.0min: [2023-12-11 17:14:41,044][model3_sft.py][INFO] Epoch:[1/2](28600/63764) loss:2.826 lr:0.0000010 epoch_Time:217.0min: [2023-12-11 17:14:59,547][model3_sft.py][INFO] Epoch:[1/2](28650/63764) loss:2.463 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 17:15:18,089][model3_sft.py][INFO] Epoch:[1/2](28700/63764) loss:2.831 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 17:15:36,600][model3_sft.py][INFO] Epoch:[1/2](28750/63764) loss:2.632 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 17:15:55,125][model3_sft.py][INFO] Epoch:[1/2](28800/63764) loss:2.223 lr:0.0000010 epoch_Time:215.0min: [2023-12-11 17:16:13,674][model3_sft.py][INFO] Epoch:[1/2](28850/63764) loss:2.336 lr:0.0000010 epoch_Time:215.0min: [2023-12-11 17:16:32,238][model3_sft.py][INFO] Epoch:[1/2](28900/63764) loss:2.401 lr:0.0000010 epoch_Time:215.0min: [2023-12-11 
17:16:50,777][model3_sft.py][INFO] Epoch:[1/2](28950/63764) loss:3.762 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 17:17:09,295][model3_sft.py][INFO] Epoch:[1/2](29000/63764) loss:3.183 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 17:17:27,851][model3_sft.py][INFO] Epoch:[1/2](29050/63764) loss:2.962 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 17:17:46,406][model3_sft.py][INFO] Epoch:[1/2](29100/63764) loss:3.331 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 17:18:04,975][model3_sft.py][INFO] Epoch:[1/2](29150/63764) loss:2.717 lr:0.0000010 epoch_Time:213.0min: [2023-12-11 17:18:23,492][model3_sft.py][INFO] Epoch:[1/2](29200/63764) loss:3.102 lr:0.0000010 epoch_Time:213.0min: [2023-12-11 17:18:42,040][model3_sft.py][INFO] Epoch:[1/2](29250/63764) loss:2.269 lr:0.0000010 epoch_Time:213.0min: [2023-12-11 17:19:00,558][model3_sft.py][INFO] Epoch:[1/2](29300/63764) loss:2.767 lr:0.0000010 epoch_Time:212.0min: [2023-12-11 17:19:19,051][model3_sft.py][INFO] Epoch:[1/2](29350/63764) loss:3.199 lr:0.0000010 epoch_Time:212.0min: [2023-12-11 17:19:37,476][model3_sft.py][INFO] Epoch:[1/2](29400/63764) loss:2.922 lr:0.0000010 epoch_Time:212.0min: [2023-12-11 17:19:55,986][model3_sft.py][INFO] Epoch:[1/2](29450/63764) loss:2.925 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 17:20:14,703][model3_sft.py][INFO] Epoch:[1/2](29500/63764) loss:2.353 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 17:20:33,144][model3_sft.py][INFO] Epoch:[1/2](29550/63764) loss:2.709 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 17:20:51,586][model3_sft.py][INFO] Epoch:[1/2](29600/63764) loss:3.079 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 17:21:10,056][model3_sft.py][INFO] Epoch:[1/2](29650/63764) loss:3.091 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 17:21:28,548][model3_sft.py][INFO] Epoch:[1/2](29700/63764) loss:3.204 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 17:21:46,981][model3_sft.py][INFO] Epoch:[1/2](29750/63764) loss:2.805 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 
17:22:05,453][model3_sft.py][INFO] Epoch:[1/2](29800/63764) loss:2.420 lr:0.0000010 epoch_Time:209.0min: [2023-12-11 17:22:23,911][model3_sft.py][INFO] Epoch:[1/2](29850/63764) loss:2.674 lr:0.0000010 epoch_Time:209.0min: [2023-12-11 17:22:42,365][model3_sft.py][INFO] Epoch:[1/2](29900/63764) loss:3.516 lr:0.0000010 epoch_Time:209.0min: [2023-12-11 17:23:00,791][model3_sft.py][INFO] Epoch:[1/2](29950/63764) loss:2.779 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 17:23:19,307][model3_sft.py][INFO] Epoch:[1/2](30000/63764) loss:3.476 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 17:23:37,781][model3_sft.py][INFO] Epoch:[1/2](30050/63764) loss:3.258 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 17:23:56,272][model3_sft.py][INFO] Epoch:[1/2](30100/63764) loss:3.105 lr:0.0000010 epoch_Time:207.0min: [2023-12-11 17:24:14,732][model3_sft.py][INFO] Epoch:[1/2](30150/63764) loss:2.986 lr:0.0000010 epoch_Time:207.0min: [2023-12-11 17:24:33,166][model3_sft.py][INFO] Epoch:[1/2](30200/63764) loss:2.581 lr:0.0000010 epoch_Time:207.0min: [2023-12-11 17:24:51,685][model3_sft.py][INFO] Epoch:[1/2](30250/63764) loss:2.688 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 17:25:10,167][model3_sft.py][INFO] Epoch:[1/2](30300/63764) loss:2.913 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 17:25:28,603][model3_sft.py][INFO] Epoch:[1/2](30350/63764) loss:2.920 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 17:25:47,084][model3_sft.py][INFO] Epoch:[1/2](30400/63764) loss:2.649 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 17:26:05,791][model3_sft.py][INFO] Epoch:[1/2](30450/63764) loss:3.475 lr:0.0000010 epoch_Time:205.0min: [2023-12-11 17:26:24,268][model3_sft.py][INFO] Epoch:[1/2](30500/63764) loss:3.108 lr:0.0000010 epoch_Time:205.0min: [2023-12-11 17:26:42,719][model3_sft.py][INFO] Epoch:[1/2](30550/63764) loss:2.899 lr:0.0000010 epoch_Time:205.0min: [2023-12-11 17:27:01,150][model3_sft.py][INFO] Epoch:[1/2](30600/63764) loss:2.755 lr:0.0000010 epoch_Time:204.0min: [2023-12-11 
17:27:19,673][model3_sft.py][INFO] Epoch:[1/2](30650/63764) loss:2.965 lr:0.0000010 epoch_Time:204.0min: [2023-12-11 17:27:38,132][model3_sft.py][INFO] Epoch:[1/2](30700/63764) loss:2.570 lr:0.0000010 epoch_Time:204.0min: [2023-12-11 17:27:56,586][model3_sft.py][INFO] Epoch:[1/2](30750/63764) loss:2.764 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 17:28:15,088][model3_sft.py][INFO] Epoch:[1/2](30800/63764) loss:3.146 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 17:28:33,656][model3_sft.py][INFO] Epoch:[1/2](30850/63764) loss:2.977 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 17:28:52,130][model3_sft.py][INFO] Epoch:[1/2](30900/63764) loss:3.113 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 17:29:10,626][model3_sft.py][INFO] Epoch:[1/2](30950/63764) loss:2.800 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 17:29:29,160][model3_sft.py][INFO] Epoch:[1/2](31000/63764) loss:2.615 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 17:29:47,643][model3_sft.py][INFO] Epoch:[1/2](31050/63764) loss:2.965 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 17:30:06,113][model3_sft.py][INFO] Epoch:[1/2](31100/63764) loss:2.741 lr:0.0000010 epoch_Time:201.0min: [2023-12-11 17:30:24,640][model3_sft.py][INFO] Epoch:[1/2](31150/63764) loss:3.149 lr:0.0000010 epoch_Time:201.0min: [2023-12-11 17:30:43,144][model3_sft.py][INFO] Epoch:[1/2](31200/63764) loss:3.281 lr:0.0000010 epoch_Time:201.0min: [2023-12-11 17:31:01,623][model3_sft.py][INFO] Epoch:[1/2](31250/63764) loss:3.610 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 17:31:20,098][model3_sft.py][INFO] Epoch:[1/2](31300/63764) loss:2.927 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 17:31:38,588][model3_sft.py][INFO] Epoch:[1/2](31350/63764) loss:3.356 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 17:31:57,322][model3_sft.py][INFO] Epoch:[1/2](31400/63764) loss:2.727 lr:0.0000010 epoch_Time:199.0min: [2023-12-11 17:32:15,797][model3_sft.py][INFO] Epoch:[1/2](31450/63764) loss:2.527 lr:0.0000010 epoch_Time:199.0min: [2023-12-11 
17:32:34,262][model3_sft.py][INFO] Epoch:[1/2](31500/63764) loss:2.981 lr:0.0000010 epoch_Time:199.0min: [2023-12-11 17:32:52,739][model3_sft.py][INFO] Epoch:[1/2](31550/63764) loss:2.695 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 17:33:11,172][model3_sft.py][INFO] Epoch:[1/2](31600/63764) loss:3.599 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 17:33:29,652][model3_sft.py][INFO] Epoch:[1/2](31650/63764) loss:3.197 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 17:33:48,108][model3_sft.py][INFO] Epoch:[1/2](31700/63764) loss:2.335 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 17:34:06,686][model3_sft.py][INFO] Epoch:[1/2](31750/63764) loss:2.743 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 17:34:25,211][model3_sft.py][INFO] Epoch:[1/2](31800/63764) loss:2.978 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 17:34:43,666][model3_sft.py][INFO] Epoch:[1/2](31850/63764) loss:3.247 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 17:35:02,144][model3_sft.py][INFO] Epoch:[1/2](31900/63764) loss:3.071 lr:0.0000010 epoch_Time:196.0min: [2023-12-11 17:35:20,654][model3_sft.py][INFO] Epoch:[1/2](31950/63764) loss:2.946 lr:0.0000010 epoch_Time:196.0min: [2023-12-11 17:35:39,196][model3_sft.py][INFO] Epoch:[1/2](32000/63764) loss:2.744 lr:0.0000010 epoch_Time:196.0min: [2023-12-11 17:35:57,698][model3_sft.py][INFO] Epoch:[1/2](32050/63764) loss:2.476 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 17:36:16,151][model3_sft.py][INFO] Epoch:[1/2](32100/63764) loss:3.497 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 17:36:34,656][model3_sft.py][INFO] Epoch:[1/2](32150/63764) loss:2.595 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 17:36:53,145][model3_sft.py][INFO] Epoch:[1/2](32200/63764) loss:2.783 lr:0.0000010 epoch_Time:194.0min: [2023-12-11 17:37:11,681][model3_sft.py][INFO] Epoch:[1/2](32250/63764) loss:2.755 lr:0.0000010 epoch_Time:194.0min: [2023-12-11 17:37:30,151][model3_sft.py][INFO] Epoch:[1/2](32300/63764) loss:2.762 lr:0.0000010 epoch_Time:194.0min: [2023-12-11 
17:37:48,680][model3_sft.py][INFO] Epoch:[1/2](32350/63764) loss:3.057 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 17:38:07,383][model3_sft.py][INFO] Epoch:[1/2](32400/63764) loss:3.034 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 17:38:25,901][model3_sft.py][INFO] Epoch:[1/2](32450/63764) loss:3.194 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 17:38:44,377][model3_sft.py][INFO] Epoch:[1/2](32500/63764) loss:3.000 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 17:39:02,912][model3_sft.py][INFO] Epoch:[1/2](32550/63764) loss:2.596 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 17:39:21,412][model3_sft.py][INFO] Epoch:[1/2](32600/63764) loss:2.767 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 17:39:39,982][model3_sft.py][INFO] Epoch:[1/2](32650/63764) loss:2.867 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 17:39:58,535][model3_sft.py][INFO] Epoch:[1/2](32700/63764) loss:3.489 lr:0.0000010 epoch_Time:191.0min: [2023-12-11 17:40:17,113][model3_sft.py][INFO] Epoch:[1/2](32750/63764) loss:2.389 lr:0.0000010 epoch_Time:191.0min: [2023-12-11 17:40:35,677][model3_sft.py][INFO] Epoch:[1/2](32800/63764) loss:2.900 lr:0.0000010 epoch_Time:191.0min: [2023-12-11 17:40:54,201][model3_sft.py][INFO] Epoch:[1/2](32850/63764) loss:3.147 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 17:41:12,713][model3_sft.py][INFO] Epoch:[1/2](32900/63764) loss:2.379 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 17:41:31,253][model3_sft.py][INFO] Epoch:[1/2](32950/63764) loss:3.256 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 17:41:49,772][model3_sft.py][INFO] Epoch:[1/2](33000/63764) loss:3.431 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 17:42:08,252][model3_sft.py][INFO] Epoch:[1/2](33050/63764) loss:2.712 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 17:42:26,781][model3_sft.py][INFO] Epoch:[1/2](33100/63764) loss:2.576 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 17:42:45,263][model3_sft.py][INFO] Epoch:[1/2](33150/63764) loss:3.438 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 
17:43:03,722][model3_sft.py][INFO] Epoch:[1/2](33200/63764) loss:2.571 lr:0.0000010 epoch_Time:188.0min: [2023-12-11 17:43:22,215][model3_sft.py][INFO] Epoch:[1/2](33250/63764) loss:3.965 lr:0.0000010 epoch_Time:188.0min: [2023-12-11 17:43:40,691][model3_sft.py][INFO] Epoch:[1/2](33300/63764) loss:3.334 lr:0.0000010 epoch_Time:188.0min: [2023-12-11 17:43:59,407][model3_sft.py][INFO] Epoch:[1/2](33350/63764) loss:2.814 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 17:44:17,869][model3_sft.py][INFO] Epoch:[1/2](33400/63764) loss:3.083 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 17:44:36,335][model3_sft.py][INFO] Epoch:[1/2](33450/63764) loss:3.758 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 17:44:54,881][model3_sft.py][INFO] Epoch:[1/2](33500/63764) loss:3.275 lr:0.0000010 epoch_Time:186.0min: [2023-12-11 17:45:13,356][model3_sft.py][INFO] Epoch:[1/2](33550/63764) loss:3.301 lr:0.0000010 epoch_Time:186.0min: [2023-12-11 17:45:31,815][model3_sft.py][INFO] Epoch:[1/2](33600/63764) loss:3.017 lr:0.0000010 epoch_Time:186.0min: [2023-12-11 17:45:50,326][model3_sft.py][INFO] Epoch:[1/2](33650/63764) loss:2.922 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 17:46:08,860][model3_sft.py][INFO] Epoch:[1/2](33700/63764) loss:3.179 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 17:46:27,344][model3_sft.py][INFO] Epoch:[1/2](33750/63764) loss:3.327 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 17:46:45,805][model3_sft.py][INFO] Epoch:[1/2](33800/63764) loss:3.410 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 17:47:04,259][model3_sft.py][INFO] Epoch:[1/2](33850/63764) loss:2.893 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 17:47:22,716][model3_sft.py][INFO] Epoch:[1/2](33900/63764) loss:2.697 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 17:47:41,176][model3_sft.py][INFO] Epoch:[1/2](33950/63764) loss:3.151 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 17:47:59,633][model3_sft.py][INFO] Epoch:[1/2](34000/63764) loss:3.341 lr:0.0000010 epoch_Time:183.0min: [2023-12-11 
17:48:18,129][model3_sft.py][INFO] Epoch:[1/2](34050/63764) loss:2.383 lr:0.0000010 epoch_Time:183.0min: [2023-12-11 17:48:36,594][model3_sft.py][INFO] Epoch:[1/2](34100/63764) loss:3.000 lr:0.0000010 epoch_Time:183.0min: [2023-12-11 17:48:55,079][model3_sft.py][INFO] Epoch:[1/2](34150/63764) loss:3.123 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 17:49:13,597][model3_sft.py][INFO] Epoch:[1/2](34200/63764) loss:3.244 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 17:49:32,069][model3_sft.py][INFO] Epoch:[1/2](34250/63764) loss:3.461 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 17:49:50,784][model3_sft.py][INFO] Epoch:[1/2](34300/63764) loss:3.372 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 17:50:09,249][model3_sft.py][INFO] Epoch:[1/2](34350/63764) loss:2.862 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 17:50:27,755][model3_sft.py][INFO] Epoch:[1/2](34400/63764) loss:3.564 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 17:50:46,253][model3_sft.py][INFO] Epoch:[1/2](34450/63764) loss:2.872 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 17:51:04,796][model3_sft.py][INFO] Epoch:[1/2](34500/63764) loss:3.169 lr:0.0000010 epoch_Time:180.0min: [2023-12-11 17:51:23,237][model3_sft.py][INFO] Epoch:[1/2](34550/63764) loss:2.819 lr:0.0000010 epoch_Time:180.0min: [2023-12-11 17:51:41,676][model3_sft.py][INFO] Epoch:[1/2](34600/63764) loss:3.177 lr:0.0000010 epoch_Time:180.0min: [2023-12-11 17:52:00,159][model3_sft.py][INFO] Epoch:[1/2](34650/63764) loss:2.715 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 17:52:18,617][model3_sft.py][INFO] Epoch:[1/2](34700/63764) loss:3.536 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 17:52:37,106][model3_sft.py][INFO] Epoch:[1/2](34750/63764) loss:2.992 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 17:52:55,545][model3_sft.py][INFO] Epoch:[1/2](34800/63764) loss:3.009 lr:0.0000010 epoch_Time:178.0min: [2023-12-11 17:53:14,013][model3_sft.py][INFO] Epoch:[1/2](34850/63764) loss:2.945 lr:0.0000010 epoch_Time:178.0min: [2023-12-11 
17:53:32,423][model3_sft.py][INFO] Epoch:[1/2](34900/63764) loss:2.469 lr:0.0000010 epoch_Time:178.0min: [2023-12-11 17:53:50,851][model3_sft.py][INFO] Epoch:[1/2](34950/63764) loss:3.002 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 17:54:09,353][model3_sft.py][INFO] Epoch:[1/2](35000/63764) loss:2.878 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 17:54:27,842][model3_sft.py][INFO] Epoch:[1/2](35050/63764) loss:2.561 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 17:54:46,276][model3_sft.py][INFO] Epoch:[1/2](35100/63764) loss:2.809 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 17:55:04,690][model3_sft.py][INFO] Epoch:[1/2](35150/63764) loss:2.775 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 17:55:23,127][model3_sft.py][INFO] Epoch:[1/2](35200/63764) loss:3.200 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 17:55:41,808][model3_sft.py][INFO] Epoch:[1/2](35250/63764) loss:3.227 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 17:56:00,282][model3_sft.py][INFO] Epoch:[1/2](35300/63764) loss:3.298 lr:0.0000010 epoch_Time:175.0min: [2023-12-11 17:56:18,707][model3_sft.py][INFO] Epoch:[1/2](35350/63764) loss:3.251 lr:0.0000010 epoch_Time:175.0min: [2023-12-11 17:56:37,184][model3_sft.py][INFO] Epoch:[1/2](35400/63764) loss:2.816 lr:0.0000010 epoch_Time:175.0min: [2023-12-11 17:56:55,677][model3_sft.py][INFO] Epoch:[1/2](35450/63764) loss:2.867 lr:0.0000010 epoch_Time:174.0min: [2023-12-11 17:57:14,129][model3_sft.py][INFO] Epoch:[1/2](35500/63764) loss:3.079 lr:0.0000010 epoch_Time:174.0min: [2023-12-11 17:57:32,610][model3_sft.py][INFO] Epoch:[1/2](35550/63764) loss:2.999 lr:0.0000010 epoch_Time:174.0min: [2023-12-11 17:57:51,080][model3_sft.py][INFO] Epoch:[1/2](35600/63764) loss:3.307 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 17:58:09,598][model3_sft.py][INFO] Epoch:[1/2](35650/63764) loss:3.006 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 17:58:28,075][model3_sft.py][INFO] Epoch:[1/2](35700/63764) loss:3.002 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 
17:58:46,563][model3_sft.py][INFO] Epoch:[1/2](35750/63764) loss:3.515 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 17:59:05,043][model3_sft.py][INFO] Epoch:[1/2](35800/63764) loss:2.890 lr:0.0000010 epoch_Time:172.0min: [2023-12-11 17:59:23,474][model3_sft.py][INFO] Epoch:[1/2](35850/63764) loss:2.751 lr:0.0000010 epoch_Time:172.0min: [2023-12-11 17:59:41,951][model3_sft.py][INFO] Epoch:[1/2](35900/63764) loss:3.193 lr:0.0000010 epoch_Time:172.0min: [2023-12-11 18:00:00,402][model3_sft.py][INFO] Epoch:[1/2](35950/63764) loss:3.383 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 18:00:18,854][model3_sft.py][INFO] Epoch:[1/2](36000/63764) loss:2.809 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 18:00:37,325][model3_sft.py][INFO] Epoch:[1/2](36050/63764) loss:3.212 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 18:00:55,809][model3_sft.py][INFO] Epoch:[1/2](36100/63764) loss:3.461 lr:0.0000010 epoch_Time:170.0min: [2023-12-11 18:01:14,282][model3_sft.py][INFO] Epoch:[1/2](36150/63764) loss:2.967 lr:0.0000010 epoch_Time:170.0min: [2023-12-11 18:01:32,990][model3_sft.py][INFO] Epoch:[1/2](36200/63764) loss:2.722 lr:0.0000010 epoch_Time:170.0min: [2023-12-11 18:01:51,465][model3_sft.py][INFO] Epoch:[1/2](36250/63764) loss:2.928 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 18:02:09,919][model3_sft.py][INFO] Epoch:[1/2](36300/63764) loss:2.899 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 18:02:28,433][model3_sft.py][INFO] Epoch:[1/2](36350/63764) loss:3.741 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 18:02:46,877][model3_sft.py][INFO] Epoch:[1/2](36400/63764) loss:3.207 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 18:03:05,345][model3_sft.py][INFO] Epoch:[1/2](36450/63764) loss:2.800 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 18:03:23,787][model3_sft.py][INFO] Epoch:[1/2](36500/63764) loss:3.116 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 18:03:42,275][model3_sft.py][INFO] Epoch:[1/2](36550/63764) loss:2.695 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 
18:04:00,720][model3_sft.py][INFO] Epoch:[1/2](36600/63764) loss:3.630 lr:0.0000010 epoch_Time:167.0min: [2023-12-11 18:04:19,160][model3_sft.py][INFO] Epoch:[1/2](36650/63764) loss:2.280 lr:0.0000010 epoch_Time:167.0min: [2023-12-11 18:04:37,649][model3_sft.py][INFO] Epoch:[1/2](36700/63764) loss:3.097 lr:0.0000010 epoch_Time:167.0min: [2023-12-11 18:04:56,190][model3_sft.py][INFO] Epoch:[1/2](36750/63764) loss:3.098 lr:0.0000010 epoch_Time:166.0min: [2023-12-11 18:05:14,638][model3_sft.py][INFO] Epoch:[1/2](36800/63764) loss:3.068 lr:0.0000010 epoch_Time:166.0min: [2023-12-11 18:05:33,110][model3_sft.py][INFO] Epoch:[1/2](36850/63764) loss:3.082 lr:0.0000010 epoch_Time:166.0min: [2023-12-11 18:05:51,571][model3_sft.py][INFO] Epoch:[1/2](36900/63764) loss:2.560 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 18:06:10,043][model3_sft.py][INFO] Epoch:[1/2](36950/63764) loss:3.021 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 18:06:28,509][model3_sft.py][INFO] Epoch:[1/2](37000/63764) loss:3.273 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 18:06:47,027][model3_sft.py][INFO] Epoch:[1/2](37050/63764) loss:2.337 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 18:07:05,549][model3_sft.py][INFO] Epoch:[1/2](37100/63764) loss:2.977 lr:0.0000010 epoch_Time:164.0min: [2023-12-11 18:07:24,019][model3_sft.py][INFO] Epoch:[1/2](37150/63764) loss:3.669 lr:0.0000010 epoch_Time:164.0min: [2023-12-11 18:07:42,679][model3_sft.py][INFO] Epoch:[1/2](37200/63764) loss:4.018 lr:0.0000010 epoch_Time:164.0min: [2023-12-11 18:08:01,154][model3_sft.py][INFO] Epoch:[1/2](37250/63764) loss:3.212 lr:0.0000010 epoch_Time:163.0min: [2023-12-11 18:08:19,621][model3_sft.py][INFO] Epoch:[1/2](37300/63764) loss:2.703 lr:0.0000010 epoch_Time:163.0min: [2023-12-11 18:08:38,128][model3_sft.py][INFO] Epoch:[1/2](37350/63764) loss:2.760 lr:0.0000010 epoch_Time:163.0min: [2023-12-11 18:08:56,595][model3_sft.py][INFO] Epoch:[1/2](37400/63764) loss:2.626 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 
18:09:15,026][model3_sft.py][INFO] Epoch:[1/2](37450/63764) loss:3.154 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 18:09:33,510][model3_sft.py][INFO] Epoch:[1/2](37500/63764) loss:3.652 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 18:09:51,969][model3_sft.py][INFO] Epoch:[1/2](37550/63764) loss:4.062 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 18:10:10,429][model3_sft.py][INFO] Epoch:[1/2](37600/63764) loss:3.128 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 18:10:28,920][model3_sft.py][INFO] Epoch:[1/2](37650/63764) loss:3.400 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 18:10:47,407][model3_sft.py][INFO] Epoch:[1/2](37700/63764) loss:3.091 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 18:11:05,892][model3_sft.py][INFO] Epoch:[1/2](37750/63764) loss:3.319 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 18:11:24,314][model3_sft.py][INFO] Epoch:[1/2](37800/63764) loss:3.838 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 18:11:42,789][model3_sft.py][INFO] Epoch:[1/2](37850/63764) loss:3.106 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 18:12:01,248][model3_sft.py][INFO] Epoch:[1/2](37900/63764) loss:2.561 lr:0.0000010 epoch_Time:159.0min: [2023-12-11 18:12:19,699][model3_sft.py][INFO] Epoch:[1/2](37950/63764) loss:3.038 lr:0.0000010 epoch_Time:159.0min: [2023-12-11 18:12:38,217][model3_sft.py][INFO] Epoch:[1/2](38000/63764) loss:2.891 lr:0.0000010 epoch_Time:159.0min: [2023-12-11 18:12:56,658][model3_sft.py][INFO] Epoch:[1/2](38050/63764) loss:3.433 lr:0.0000010 epoch_Time:158.0min: [2023-12-11 18:13:15,143][model3_sft.py][INFO] Epoch:[1/2](38100/63764) loss:2.301 lr:0.0000010 epoch_Time:158.0min: [2023-12-11 18:13:33,828][model3_sft.py][INFO] Epoch:[1/2](38150/63764) loss:2.927 lr:0.0000010 epoch_Time:158.0min: [2023-12-11 18:13:52,360][model3_sft.py][INFO] Epoch:[1/2](38200/63764) loss:3.143 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 18:14:10,820][model3_sft.py][INFO] Epoch:[1/2](38250/63764) loss:3.683 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 
18:14:29,347][model3_sft.py][INFO] Epoch:[1/2](38300/63764) loss:2.604 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 18:14:47,833][model3_sft.py][INFO] Epoch:[1/2](38350/63764) loss:3.603 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 18:15:06,338][model3_sft.py][INFO] Epoch:[1/2](38400/63764) loss:3.217 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 18:15:24,863][model3_sft.py][INFO] Epoch:[1/2](38450/63764) loss:3.573 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 18:15:43,354][model3_sft.py][INFO] Epoch:[1/2](38500/63764) loss:2.212 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 18:16:01,817][model3_sft.py][INFO] Epoch:[1/2](38550/63764) loss:3.584 lr:0.0000010 epoch_Time:155.0min: [2023-12-11 18:16:20,308][model3_sft.py][INFO] Epoch:[1/2](38600/63764) loss:2.951 lr:0.0000010 epoch_Time:155.0min: [2023-12-11 18:16:38,786][model3_sft.py][INFO] Epoch:[1/2](38650/63764) loss:2.855 lr:0.0000010 epoch_Time:155.0min: [2023-12-11 18:16:57,288][model3_sft.py][INFO] Epoch:[1/2](38700/63764) loss:2.910 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 18:17:15,828][model3_sft.py][INFO] Epoch:[1/2](38750/63764) loss:2.154 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 18:17:34,333][model3_sft.py][INFO] Epoch:[1/2](38800/63764) loss:2.813 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 18:17:52,813][model3_sft.py][INFO] Epoch:[1/2](38850/63764) loss:3.391 lr:0.0000010 epoch_Time:153.0min: [2023-12-11 18:18:11,305][model3_sft.py][INFO] Epoch:[1/2](38900/63764) loss:2.446 lr:0.0000010 epoch_Time:153.0min: [2023-12-11 18:18:29,749][model3_sft.py][INFO] Epoch:[1/2](38950/63764) loss:3.738 lr:0.0000010 epoch_Time:153.0min: [2023-12-11 18:18:48,214][model3_sft.py][INFO] Epoch:[1/2](39000/63764) loss:3.207 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 18:19:06,681][model3_sft.py][INFO] Epoch:[1/2](39050/63764) loss:2.609 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 18:19:25,352][model3_sft.py][INFO] Epoch:[1/2](39100/63764) loss:2.826 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 
18:19:43,811][model3_sft.py][INFO] Epoch:[1/2](39150/63764) loss:2.899 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 18:20:02,268][model3_sft.py][INFO] Epoch:[1/2](39200/63764) loss:3.304 lr:0.0000010 epoch_Time:151.0min: [2023-12-11 18:20:20,768][model3_sft.py][INFO] Epoch:[1/2](39250/63764) loss:3.558 lr:0.0000010 epoch_Time:151.0min: [2023-12-11 18:20:39,219][model3_sft.py][INFO] Epoch:[1/2](39300/63764) loss:2.866 lr:0.0000010 epoch_Time:151.0min: [2023-12-11 18:20:57,690][model3_sft.py][INFO] Epoch:[1/2](39350/63764) loss:2.857 lr:0.0000010 epoch_Time:150.0min: [2023-12-11 18:21:16,151][model3_sft.py][INFO] Epoch:[1/2](39400/63764) loss:3.004 lr:0.0000010 epoch_Time:150.0min: [2023-12-11 18:21:34,623][model3_sft.py][INFO] Epoch:[1/2](39450/63764) loss:3.240 lr:0.0000010 epoch_Time:150.0min: [2023-12-11 18:21:53,073][model3_sft.py][INFO] Epoch:[1/2](39500/63764) loss:3.333 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 18:22:11,602][model3_sft.py][INFO] Epoch:[1/2](39550/63764) loss:3.307 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 18:22:30,075][model3_sft.py][INFO] Epoch:[1/2](39600/63764) loss:3.661 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 18:22:48,608][model3_sft.py][INFO] Epoch:[1/2](39650/63764) loss:3.409 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 18:23:07,011][model3_sft.py][INFO] Epoch:[1/2](39700/63764) loss:3.072 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 18:23:25,472][model3_sft.py][INFO] Epoch:[1/2](39750/63764) loss:3.081 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 18:23:43,945][model3_sft.py][INFO] Epoch:[1/2](39800/63764) loss:3.936 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 18:24:02,402][model3_sft.py][INFO] Epoch:[1/2](39850/63764) loss:3.338 lr:0.0000010 epoch_Time:147.0min: [2023-12-11 18:24:20,842][model3_sft.py][INFO] Epoch:[1/2](39900/63764) loss:2.379 lr:0.0000010 epoch_Time:147.0min: [2023-12-11 18:24:39,325][model3_sft.py][INFO] Epoch:[1/2](39950/63764) loss:3.079 lr:0.0000010 epoch_Time:147.0min: [2023-12-11 
18:24:57,828][model3_sft.py][INFO] Epoch:[1/2](40000/63764) loss:3.179 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 18:25:16,490][model3_sft.py][INFO] Epoch:[1/2](40050/63764) loss:2.624 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 18:25:34,952][model3_sft.py][INFO] Epoch:[1/2](40100/63764) loss:2.989 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 18:25:53,363][model3_sft.py][INFO] Epoch:[1/2](40150/63764) loss:3.438 lr:0.0000010 epoch_Time:145.0min: [2023-12-11 18:26:11,845][model3_sft.py][INFO] Epoch:[1/2](40200/63764) loss:2.746 lr:0.0000010 epoch_Time:145.0min: [2023-12-11 18:26:30,317][model3_sft.py][INFO] Epoch:[1/2](40250/63764) loss:3.271 lr:0.0000010 epoch_Time:145.0min: [2023-12-11 18:26:48,796][model3_sft.py][INFO] Epoch:[1/2](40300/63764) loss:2.844 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 18:27:07,257][model3_sft.py][INFO] Epoch:[1/2](40350/63764) loss:3.311 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 18:27:25,708][model3_sft.py][INFO] Epoch:[1/2](40400/63764) loss:3.422 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 18:27:44,174][model3_sft.py][INFO] Epoch:[1/2](40450/63764) loss:3.710 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 18:28:02,648][model3_sft.py][INFO] Epoch:[1/2](40500/63764) loss:2.589 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 18:28:21,093][model3_sft.py][INFO] Epoch:[1/2](40550/63764) loss:3.674 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 18:28:39,568][model3_sft.py][INFO] Epoch:[1/2](40600/63764) loss:3.212 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 18:28:58,048][model3_sft.py][INFO] Epoch:[1/2](40650/63764) loss:2.994 lr:0.0000010 epoch_Time:142.0min: [2023-12-11 18:29:16,572][model3_sft.py][INFO] Epoch:[1/2](40700/63764) loss:2.417 lr:0.0000010 epoch_Time:142.0min: [2023-12-11 18:29:35,101][model3_sft.py][INFO] Epoch:[1/2](40750/63764) loss:3.169 lr:0.0000010 epoch_Time:142.0min: [2023-12-11 18:29:53,562][model3_sft.py][INFO] Epoch:[1/2](40800/63764) loss:2.077 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 
18:30:12,051][model3_sft.py][INFO] Epoch:[1/2](40850/63764) loss:3.126 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 18:30:30,522][model3_sft.py][INFO] Epoch:[1/2](40900/63764) loss:3.425 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 18:30:49,021][model3_sft.py][INFO] Epoch:[1/2](40950/63764) loss:3.171 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 18:31:07,713][model3_sft.py][INFO] Epoch:[1/2](41000/63764) loss:2.873 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 18:31:26,221][model3_sft.py][INFO] Epoch:[1/2](41050/63764) loss:3.260 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 18:31:44,645][model3_sft.py][INFO] Epoch:[1/2](41100/63764) loss:3.584 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 18:32:03,146][model3_sft.py][INFO] Epoch:[1/2](41150/63764) loss:2.981 lr:0.0000010 epoch_Time:139.0min: [2023-12-11 18:32:21,594][model3_sft.py][INFO] Epoch:[1/2](41200/63764) loss:3.060 lr:0.0000010 epoch_Time:139.0min: [2023-12-11 18:32:40,072][model3_sft.py][INFO] Epoch:[1/2](41250/63764) loss:3.104 lr:0.0000010 epoch_Time:139.0min: [2023-12-11 18:32:58,507][model3_sft.py][INFO] Epoch:[1/2](41300/63764) loss:3.100 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 18:33:16,985][model3_sft.py][INFO] Epoch:[1/2](41350/63764) loss:3.843 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 18:33:35,480][model3_sft.py][INFO] Epoch:[1/2](41400/63764) loss:3.467 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 18:33:53,954][model3_sft.py][INFO] Epoch:[1/2](41450/63764) loss:3.614 lr:0.0000010 epoch_Time:137.0min: [2023-12-11 18:34:12,415][model3_sft.py][INFO] Epoch:[1/2](41500/63764) loss:3.021 lr:0.0000010 epoch_Time:137.0min: [2023-12-11 18:34:30,906][model3_sft.py][INFO] Epoch:[1/2](41550/63764) loss:3.674 lr:0.0000010 epoch_Time:137.0min: [2023-12-11 18:34:49,354][model3_sft.py][INFO] Epoch:[1/2](41600/63764) loss:3.366 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 18:35:07,841][model3_sft.py][INFO] Epoch:[1/2](41650/63764) loss:3.569 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 
18:35:26,318][model3_sft.py][INFO] Epoch:[1/2](41700/63764) loss:3.000 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 18:35:44,797][model3_sft.py][INFO] Epoch:[1/2](41750/63764) loss:3.185 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 18:36:03,263][model3_sft.py][INFO] Epoch:[1/2](41800/63764) loss:2.917 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 18:36:21,750][model3_sft.py][INFO] Epoch:[1/2](41850/63764) loss:2.361 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 18:36:40,246][model3_sft.py][INFO] Epoch:[1/2](41900/63764) loss:3.419 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 18:36:58,706][model3_sft.py][INFO] Epoch:[1/2](41950/63764) loss:3.061 lr:0.0000010 epoch_Time:134.0min: [2023-12-11 18:37:17,379][model3_sft.py][INFO] Epoch:[1/2](42000/63764) loss:3.555 lr:0.0000010 epoch_Time:134.0min: [2023-12-11 18:37:35,835][model3_sft.py][INFO] Epoch:[1/2](42050/63764) loss:3.242 lr:0.0000010 epoch_Time:134.0min: [2023-12-11 18:37:54,370][model3_sft.py][INFO] Epoch:[1/2](42100/63764) loss:2.846 lr:0.0000010 epoch_Time:133.0min: [2023-12-11 18:38:12,862][model3_sft.py][INFO] Epoch:[1/2](42150/63764) loss:3.234 lr:0.0000010 epoch_Time:133.0min: [2023-12-11 18:38:31,371][model3_sft.py][INFO] Epoch:[1/2](42200/63764) loss:2.791 lr:0.0000010 epoch_Time:133.0min: [2023-12-11 18:38:49,875][model3_sft.py][INFO] Epoch:[1/2](42250/63764) loss:3.571 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 18:39:08,390][model3_sft.py][INFO] Epoch:[1/2](42300/63764) loss:3.076 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 18:39:26,842][model3_sft.py][INFO] Epoch:[1/2](42350/63764) loss:3.597 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 18:39:45,328][model3_sft.py][INFO] Epoch:[1/2](42400/63764) loss:2.597 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 18:40:03,799][model3_sft.py][INFO] Epoch:[1/2](42450/63764) loss:3.408 lr:0.0000010 epoch_Time:131.0min: [2023-12-11 18:40:22,301][model3_sft.py][INFO] Epoch:[1/2](42500/63764) loss:2.599 lr:0.0000010 epoch_Time:131.0min: [2023-12-11 
18:40:40,752][model3_sft.py][INFO] Epoch:[1/2](42550/63764) loss:3.184 lr:0.0000010 epoch_Time:131.0min: [2023-12-11 18:40:59,254][model3_sft.py][INFO] Epoch:[1/2](42600/63764) loss:2.919 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 18:41:17,711][model3_sft.py][INFO] Epoch:[1/2](42650/63764) loss:3.371 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 18:41:36,221][model3_sft.py][INFO] Epoch:[1/2](42700/63764) loss:3.247 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 18:41:54,706][model3_sft.py][INFO] Epoch:[1/2](42750/63764) loss:3.141 lr:0.0000010 epoch_Time:129.0min: [2023-12-11 18:42:13,199][model3_sft.py][INFO] Epoch:[1/2](42800/63764) loss:3.101 lr:0.0000010 epoch_Time:129.0min: [2023-12-11 18:42:31,731][model3_sft.py][INFO] Epoch:[1/2](42850/63764) loss:2.877 lr:0.0000010 epoch_Time:129.0min: [2023-12-11 18:42:50,188][model3_sft.py][INFO] Epoch:[1/2](42900/63764) loss:3.150 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 18:43:08,852][model3_sft.py][INFO] Epoch:[1/2](42950/63764) loss:2.399 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 18:43:27,267][model3_sft.py][INFO] Epoch:[1/2](43000/63764) loss:2.686 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 18:43:45,779][model3_sft.py][INFO] Epoch:[1/2](43050/63764) loss:3.427 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 18:44:04,270][model3_sft.py][INFO] Epoch:[1/2](43100/63764) loss:3.360 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 18:44:22,729][model3_sft.py][INFO] Epoch:[1/2](43150/63764) loss:3.084 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 18:44:41,174][model3_sft.py][INFO] Epoch:[1/2](43200/63764) loss:3.319 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 18:44:59,705][model3_sft.py][INFO] Epoch:[1/2](43250/63764) loss:3.033 lr:0.0000010 epoch_Time:126.0min: [2023-12-11 18:45:18,176][model3_sft.py][INFO] Epoch:[1/2](43300/63764) loss:3.044 lr:0.0000010 epoch_Time:126.0min: [2023-12-11 18:45:36,667][model3_sft.py][INFO] Epoch:[1/2](43350/63764) loss:3.356 lr:0.0000010 epoch_Time:126.0min: [2023-12-11 
18:45:55,112][model3_sft.py][INFO] Epoch:[1/2](43400/63764) loss:3.073 lr:0.0000010 epoch_Time:125.0min: [2023-12-11 18:46:13,636][model3_sft.py][INFO] Epoch:[1/2](43450/63764) loss:2.798 lr:0.0000010 epoch_Time:125.0min: [2023-12-11 18:46:32,126][model3_sft.py][INFO] Epoch:[1/2](43500/63764) loss:3.450 lr:0.0000010 epoch_Time:125.0min: [2023-12-11 18:46:50,642][model3_sft.py][INFO] Epoch:[1/2](43550/63764) loss:2.872 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 18:47:09,101][model3_sft.py][INFO] Epoch:[1/2](43600/63764) loss:3.249 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 18:47:27,588][model3_sft.py][INFO] Epoch:[1/2](43650/63764) loss:2.997 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 18:47:46,104][model3_sft.py][INFO] Epoch:[1/2](43700/63764) loss:2.803 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 18:48:04,567][model3_sft.py][INFO] Epoch:[1/2](43750/63764) loss:3.518 lr:0.0000010 epoch_Time:123.0min: [2023-12-11 18:48:23,026][model3_sft.py][INFO] Epoch:[1/2](43800/63764) loss:3.121 lr:0.0000010 epoch_Time:123.0min: [2023-12-11 18:48:41,497][model3_sft.py][INFO] Epoch:[1/2](43850/63764) loss:3.294 lr:0.0000010 epoch_Time:123.0min: [2023-12-11 18:49:00,173][model3_sft.py][INFO] Epoch:[1/2](43900/63764) loss:3.081 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 18:49:18,660][model3_sft.py][INFO] Epoch:[1/2](43950/63764) loss:3.610 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 18:49:37,117][model3_sft.py][INFO] Epoch:[1/2](44000/63764) loss:3.735 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 18:49:55,596][model3_sft.py][INFO] Epoch:[1/2](44050/63764) loss:3.282 lr:0.0000010 epoch_Time:121.0min: [2023-12-11 18:50:14,137][model3_sft.py][INFO] Epoch:[1/2](44100/63764) loss:3.313 lr:0.0000010 epoch_Time:121.0min: [2023-12-11 18:50:32,598][model3_sft.py][INFO] Epoch:[1/2](44150/63764) loss:2.967 lr:0.0000010 epoch_Time:121.0min: [2023-12-11 18:50:51,088][model3_sft.py][INFO] Epoch:[1/2](44200/63764) loss:3.238 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 
18:51:09,544][model3_sft.py][INFO] Epoch:[1/2](44250/63764) loss:2.931 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 18:51:28,010][model3_sft.py][INFO] Epoch:[1/2](44300/63764) loss:3.651 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 18:51:46,451][model3_sft.py][INFO] Epoch:[1/2](44350/63764) loss:3.434 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 18:52:04,917][model3_sft.py][INFO] Epoch:[1/2](44400/63764) loss:3.004 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 18:52:23,368][model3_sft.py][INFO] Epoch:[1/2](44450/63764) loss:3.508 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 18:52:41,827][model3_sft.py][INFO] Epoch:[1/2](44500/63764) loss:3.049 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 18:53:00,319][model3_sft.py][INFO] Epoch:[1/2](44550/63764) loss:3.763 lr:0.0000010 epoch_Time:118.0min: [2023-12-11 18:53:18,875][model3_sft.py][INFO] Epoch:[1/2](44600/63764) loss:2.769 lr:0.0000010 epoch_Time:118.0min: [2023-12-11 18:53:37,354][model3_sft.py][INFO] Epoch:[1/2](44650/63764) loss:3.346 lr:0.0000010 epoch_Time:118.0min: [2023-12-11 18:53:55,810][model3_sft.py][INFO] Epoch:[1/2](44700/63764) loss:4.075 lr:0.0000010 epoch_Time:117.0min: [2023-12-11 18:54:14,319][model3_sft.py][INFO] Epoch:[1/2](44750/63764) loss:3.002 lr:0.0000010 epoch_Time:117.0min: [2023-12-11 18:54:32,789][model3_sft.py][INFO] Epoch:[1/2](44800/63764) loss:3.236 lr:0.0000010 epoch_Time:117.0min: [2023-12-11 18:54:51,445][model3_sft.py][INFO] Epoch:[1/2](44850/63764) loss:3.428 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 18:55:09,940][model3_sft.py][INFO] Epoch:[1/2](44900/63764) loss:3.279 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 18:55:28,415][model3_sft.py][INFO] Epoch:[1/2](44950/63764) loss:3.339 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 18:55:46,838][model3_sft.py][INFO] Epoch:[1/2](45000/63764) loss:3.572 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 18:56:05,275][model3_sft.py][INFO] Epoch:[1/2](45050/63764) loss:3.069 lr:0.0000010 epoch_Time:115.0min: [2023-12-11 
18:56:23,755][model3_sft.py][INFO] Epoch:[1/2](45100/63764) loss:3.754 lr:0.0000010 epoch_Time:115.0min: [2023-12-11 18:56:42,237][model3_sft.py][INFO] Epoch:[1/2](45150/63764) loss:3.440 lr:0.0000010 epoch_Time:115.0min: [2023-12-11 18:57:00,703][model3_sft.py][INFO] Epoch:[1/2](45200/63764) loss:3.332 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 18:57:19,158][model3_sft.py][INFO] Epoch:[1/2](45250/63764) loss:3.980 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 18:57:37,621][model3_sft.py][INFO] Epoch:[1/2](45300/63764) loss:3.685 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 18:57:56,116][model3_sft.py][INFO] Epoch:[1/2](45350/63764) loss:3.083 lr:0.0000010 epoch_Time:113.0min: [2023-12-11 18:58:14,569][model3_sft.py][INFO] Epoch:[1/2](45400/63764) loss:2.849 lr:0.0000010 epoch_Time:113.0min: [2023-12-11 18:58:33,023][model3_sft.py][INFO] Epoch:[1/2](45450/63764) loss:2.628 lr:0.0000010 epoch_Time:113.0min: [2023-12-11 18:58:51,517][model3_sft.py][INFO] Epoch:[1/2](45500/63764) loss:3.253 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 18:59:10,000][model3_sft.py][INFO] Epoch:[1/2](45550/63764) loss:2.775 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 18:59:28,473][model3_sft.py][INFO] Epoch:[1/2](45600/63764) loss:3.336 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 18:59:46,946][model3_sft.py][INFO] Epoch:[1/2](45650/63764) loss:2.836 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 19:00:05,410][model3_sft.py][INFO] Epoch:[1/2](45700/63764) loss:3.489 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 19:00:23,861][model3_sft.py][INFO] Epoch:[1/2](45750/63764) loss:3.055 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 19:00:42,541][model3_sft.py][INFO] Epoch:[1/2](45800/63764) loss:3.946 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 19:01:01,038][model3_sft.py][INFO] Epoch:[1/2](45850/63764) loss:2.896 lr:0.0000010 epoch_Time:110.0min: [2023-12-11 19:01:19,543][model3_sft.py][INFO] Epoch:[1/2](45900/63764) loss:2.969 lr:0.0000010 epoch_Time:110.0min: [2023-12-11 
19:01:38,017][model3_sft.py][INFO] Epoch:[1/2](45950/63764) loss:3.536 lr:0.0000010 epoch_Time:110.0min: [2023-12-11 19:01:56,465][model3_sft.py][INFO] Epoch:[1/2](46000/63764) loss:3.601 lr:0.0000010 epoch_Time:109.0min: [2023-12-11 19:02:14,955][model3_sft.py][INFO] Epoch:[1/2](46050/63764) loss:3.362 lr:0.0000010 epoch_Time:109.0min: [2023-12-11 19:02:33,400][model3_sft.py][INFO] Epoch:[1/2](46100/63764) loss:2.611 lr:0.0000010 epoch_Time:109.0min: [2023-12-11 19:02:51,859][model3_sft.py][INFO] Epoch:[1/2](46150/63764) loss:3.254 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 19:03:10,311][model3_sft.py][INFO] Epoch:[1/2](46200/63764) loss:2.857 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 19:03:28,809][model3_sft.py][INFO] Epoch:[1/2](46250/63764) loss:2.982 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 19:03:47,284][model3_sft.py][INFO] Epoch:[1/2](46300/63764) loss:3.198 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 19:04:05,747][model3_sft.py][INFO] Epoch:[1/2](46350/63764) loss:3.316 lr:0.0000010 epoch_Time:107.0min: [2023-12-11 19:04:24,210][model3_sft.py][INFO] Epoch:[1/2](46400/63764) loss:3.733 lr:0.0000010 epoch_Time:107.0min: [2023-12-11 19:04:42,683][model3_sft.py][INFO] Epoch:[1/2](46450/63764) loss:3.211 lr:0.0000010 epoch_Time:107.0min: [2023-12-11 19:05:01,214][model3_sft.py][INFO] Epoch:[1/2](46500/63764) loss:2.412 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 19:05:19,631][model3_sft.py][INFO] Epoch:[1/2](46550/63764) loss:2.900 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 19:05:38,095][model3_sft.py][INFO] Epoch:[1/2](46600/63764) loss:3.584 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 19:05:56,569][model3_sft.py][INFO] Epoch:[1/2](46650/63764) loss:2.764 lr:0.0000010 epoch_Time:105.0min: [2023-12-11 19:06:15,041][model3_sft.py][INFO] Epoch:[1/2](46700/63764) loss:3.155 lr:0.0000010 epoch_Time:105.0min: [2023-12-11 19:06:33,488][model3_sft.py][INFO] Epoch:[1/2](46750/63764) loss:3.256 lr:0.0000010 epoch_Time:105.0min: [2023-12-11 
19:06:52,154][model3_sft.py][INFO] Epoch:[1/2](46800/63764) loss:3.452 lr:0.0000010 epoch_Time:104.0min: [2023-12-11 19:07:10,688][model3_sft.py][INFO] Epoch:[1/2](46850/63764) loss:3.734 lr:0.0000010 epoch_Time:104.0min: [2023-12-11 19:07:29,198][model3_sft.py][INFO] Epoch:[1/2](46900/63764) loss:2.592 lr:0.0000010 epoch_Time:104.0min: [2023-12-11 19:07:47,732][model3_sft.py][INFO] Epoch:[1/2](46950/63764) loss:4.025 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 19:08:06,200][model3_sft.py][INFO] Epoch:[1/2](47000/63764) loss:3.080 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 19:08:24,639][model3_sft.py][INFO] Epoch:[1/2](47050/63764) loss:3.537 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 19:08:43,099][model3_sft.py][INFO] Epoch:[1/2](47100/63764) loss:3.332 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 19:09:01,539][model3_sft.py][INFO] Epoch:[1/2](47150/63764) loss:2.501 lr:0.0000010 epoch_Time:102.0min: [2023-12-11 19:09:19,973][model3_sft.py][INFO] Epoch:[1/2](47200/63764) loss:3.070 lr:0.0000010 epoch_Time:102.0min: [2023-12-11 19:09:38,479][model3_sft.py][INFO] Epoch:[1/2](47250/63764) loss:2.776 lr:0.0000010 epoch_Time:102.0min: [2023-12-11 19:09:56,920][model3_sft.py][INFO] Epoch:[1/2](47300/63764) loss:3.434 lr:0.0000010 epoch_Time:101.0min: [2023-12-11 19:10:15,400][model3_sft.py][INFO] Epoch:[1/2](47350/63764) loss:3.081 lr:0.0000010 epoch_Time:101.0min: [2023-12-11 19:10:33,850][model3_sft.py][INFO] Epoch:[1/2](47400/63764) loss:3.784 lr:0.0000010 epoch_Time:101.0min: [2023-12-11 19:10:52,326][model3_sft.py][INFO] Epoch:[1/2](47450/63764) loss:3.118 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 19:11:10,846][model3_sft.py][INFO] Epoch:[1/2](47500/63764) loss:3.327 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 19:11:29,303][model3_sft.py][INFO] Epoch:[1/2](47550/63764) loss:3.384 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 19:11:47,790][model3_sft.py][INFO] Epoch:[1/2](47600/63764) loss:2.907 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 
19:12:06,231][model3_sft.py][INFO] Epoch:[1/2](47650/63764) loss:3.882 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 19:12:24,703][model3_sft.py][INFO] Epoch:[1/2](47700/63764) loss:2.716 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 19:12:43,402][model3_sft.py][INFO] Epoch:[1/2](47750/63764) loss:3.596 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 19:13:01,848][model3_sft.py][INFO] Epoch:[1/2](47800/63764) loss:3.651 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 19:13:20,347][model3_sft.py][INFO] Epoch:[1/2](47850/63764) loss:3.037 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 19:13:38,809][model3_sft.py][INFO] Epoch:[1/2](47900/63764) loss:2.975 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 19:13:57,321][model3_sft.py][INFO] Epoch:[1/2](47950/63764) loss:3.655 lr:0.0000010 epoch_Time:97.0min: [2023-12-11 19:14:15,845][model3_sft.py][INFO] Epoch:[1/2](48000/63764) loss:3.700 lr:0.0000010 epoch_Time:97.0min: [2023-12-11 19:14:34,366][model3_sft.py][INFO] Epoch:[1/2](48050/63764) loss:3.277 lr:0.0000010 epoch_Time:97.0min: [2023-12-11 19:14:52,869][model3_sft.py][INFO] Epoch:[1/2](48100/63764) loss:3.668 lr:0.0000010 epoch_Time:96.0min: [2023-12-11 19:15:11,398][model3_sft.py][INFO] Epoch:[1/2](48150/63764) loss:3.101 lr:0.0000010 epoch_Time:96.0min: [2023-12-11 19:15:29,910][model3_sft.py][INFO] Epoch:[1/2](48200/63764) loss:2.693 lr:0.0000010 epoch_Time:96.0min: [2023-12-11 19:15:48,437][model3_sft.py][INFO] Epoch:[1/2](48250/63764) loss:3.318 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 19:16:06,988][model3_sft.py][INFO] Epoch:[1/2](48300/63764) loss:3.448 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 19:16:25,490][model3_sft.py][INFO] Epoch:[1/2](48350/63764) loss:3.414 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 19:16:44,017][model3_sft.py][INFO] Epoch:[1/2](48400/63764) loss:3.193 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 19:17:02,553][model3_sft.py][INFO] Epoch:[1/2](48450/63764) loss:3.545 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 
19:17:21,020][model3_sft.py][INFO] Epoch:[1/2](48500/63764) loss:3.154 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 19:17:39,589][model3_sft.py][INFO] Epoch:[1/2](48550/63764) loss:2.835 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 19:17:58,065][model3_sft.py][INFO] Epoch:[1/2](48600/63764) loss:3.752 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 19:18:16,524][model3_sft.py][INFO] Epoch:[1/2](48650/63764) loss:3.929 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 19:18:35,241][model3_sft.py][INFO] Epoch:[1/2](48700/63764) loss:4.009 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 19:18:53,738][model3_sft.py][INFO] Epoch:[1/2](48750/63764) loss:2.783 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 19:19:12,269][model3_sft.py][INFO] Epoch:[1/2](48800/63764) loss:3.352 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 19:19:30,759][model3_sft.py][INFO] Epoch:[1/2](48850/63764) loss:3.704 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 19:19:49,233][model3_sft.py][INFO] Epoch:[1/2](48900/63764) loss:3.817 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 19:20:07,659][model3_sft.py][INFO] Epoch:[1/2](48950/63764) loss:3.801 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 19:20:26,138][model3_sft.py][INFO] Epoch:[1/2](49000/63764) loss:3.118 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 19:20:44,581][model3_sft.py][INFO] Epoch:[1/2](49050/63764) loss:2.296 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 19:21:03,055][model3_sft.py][INFO] Epoch:[1/2](49100/63764) loss:3.179 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 19:21:21,559][model3_sft.py][INFO] Epoch:[1/2](49150/63764) loss:2.805 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 19:21:40,045][model3_sft.py][INFO] Epoch:[1/2](49200/63764) loss:2.579 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 19:21:58,513][model3_sft.py][INFO] Epoch:[1/2](49250/63764) loss:3.501 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 19:22:16,986][model3_sft.py][INFO] Epoch:[1/2](49300/63764) loss:2.947 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 
19:22:35,454][model3_sft.py][INFO] Epoch:[1/2](49350/63764) loss:2.982 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 19:22:53,946][model3_sft.py][INFO] Epoch:[1/2](49400/63764) loss:3.114 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 19:23:12,416][model3_sft.py][INFO] Epoch:[1/2](49450/63764) loss:3.490 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 19:23:30,942][model3_sft.py][INFO] Epoch:[1/2](49500/63764) loss:3.531 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 19:23:49,420][model3_sft.py][INFO] Epoch:[1/2](49550/63764) loss:3.215 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 19:24:07,843][model3_sft.py][INFO] Epoch:[1/2](49600/63764) loss:2.902 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 19:24:26,563][model3_sft.py][INFO] Epoch:[1/2](49650/63764) loss:3.347 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 19:24:45,031][model3_sft.py][INFO] Epoch:[1/2](49700/63764) loss:4.203 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 19:25:03,547][model3_sft.py][INFO] Epoch:[1/2](49750/63764) loss:3.020 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 19:25:21,994][model3_sft.py][INFO] Epoch:[1/2](49800/63764) loss:2.746 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 19:25:40,411][model3_sft.py][INFO] Epoch:[1/2](49850/63764) loss:3.029 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 19:25:58,854][model3_sft.py][INFO] Epoch:[1/2](49900/63764) loss:2.834 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 19:26:17,352][model3_sft.py][INFO] Epoch:[1/2](49950/63764) loss:3.482 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 19:26:35,875][model3_sft.py][INFO] Epoch:[1/2](50000/63764) loss:3.012 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 19:26:54,343][model3_sft.py][INFO] Epoch:[1/2](50050/63764) loss:2.917 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 19:27:12,838][model3_sft.py][INFO] Epoch:[1/2](50100/63764) loss:3.441 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 19:27:31,348][model3_sft.py][INFO] Epoch:[1/2](50150/63764) loss:2.980 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 
19:27:49,803][model3_sft.py][INFO] Epoch:[1/2](50200/63764) loss:3.449 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 19:28:08,262][model3_sft.py][INFO] Epoch:[1/2](50250/63764) loss:3.070 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 19:28:26,757][model3_sft.py][INFO] Epoch:[1/2](50300/63764) loss:2.561 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 19:28:45,247][model3_sft.py][INFO] Epoch:[1/2](50350/63764) loss:3.337 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 19:29:03,735][model3_sft.py][INFO] Epoch:[1/2](50400/63764) loss:3.180 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 19:29:22,253][model3_sft.py][INFO] Epoch:[1/2](50450/63764) loss:3.250 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 19:29:40,759][model3_sft.py][INFO] Epoch:[1/2](50500/63764) loss:2.833 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 19:29:59,217][model3_sft.py][INFO] Epoch:[1/2](50550/63764) loss:3.225 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 19:30:17,875][model3_sft.py][INFO] Epoch:[1/2](50600/63764) loss:3.462 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 19:30:36,377][model3_sft.py][INFO] Epoch:[1/2](50650/63764) loss:2.991 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 19:30:54,859][model3_sft.py][INFO] Epoch:[1/2](50700/63764) loss:3.335 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 19:31:13,338][model3_sft.py][INFO] Epoch:[1/2](50750/63764) loss:2.418 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 19:31:31,841][model3_sft.py][INFO] Epoch:[1/2](50800/63764) loss:3.668 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 19:31:50,314][model3_sft.py][INFO] Epoch:[1/2](50850/63764) loss:3.608 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 19:32:08,804][model3_sft.py][INFO] Epoch:[1/2](50900/63764) loss:2.678 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 19:32:27,256][model3_sft.py][INFO] Epoch:[1/2](50950/63764) loss:2.888 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 19:32:45,713][model3_sft.py][INFO] Epoch:[1/2](51000/63764) loss:3.077 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 
19:33:04,258][model3_sft.py][INFO] Epoch:[1/2](51050/63764) loss:3.792 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 19:33:22,721][model3_sft.py][INFO] Epoch:[1/2](51100/63764) loss:3.348 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 19:33:41,211][model3_sft.py][INFO] Epoch:[1/2](51150/63764) loss:3.783 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 19:33:59,688][model3_sft.py][INFO] Epoch:[1/2](51200/63764) loss:3.222 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 19:34:18,180][model3_sft.py][INFO] Epoch:[1/2](51250/63764) loss:3.037 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 19:34:36,655][model3_sft.py][INFO] Epoch:[1/2](51300/63764) loss:3.391 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 19:34:55,132][model3_sft.py][INFO] Epoch:[1/2](51350/63764) loss:3.241 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 19:35:13,596][model3_sft.py][INFO] Epoch:[1/2](51400/63764) loss:3.070 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 19:35:32,073][model3_sft.py][INFO] Epoch:[1/2](51450/63764) loss:3.132 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 19:35:50,550][model3_sft.py][INFO] Epoch:[1/2](51500/63764) loss:3.050 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 19:36:09,002][model3_sft.py][INFO] Epoch:[1/2](51550/63764) loss:3.552 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 19:36:27,671][model3_sft.py][INFO] Epoch:[1/2](51600/63764) loss:3.502 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 19:36:46,169][model3_sft.py][INFO] Epoch:[1/2](51650/63764) loss:3.056 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 19:37:04,603][model3_sft.py][INFO] Epoch:[1/2](51700/63764) loss:3.164 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 19:37:23,096][model3_sft.py][INFO] Epoch:[1/2](51750/63764) loss:3.623 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 19:37:41,636][model3_sft.py][INFO] Epoch:[1/2](51800/63764) loss:3.327 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 19:38:00,100][model3_sft.py][INFO] Epoch:[1/2](51850/63764) loss:3.977 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 
19:38:18,488][model3_sft.py][INFO] Epoch:[1/2](51900/63764) loss:3.750 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 19:38:37,012][model3_sft.py][INFO] Epoch:[1/2](51950/63764) loss:3.439 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 19:38:55,521][model3_sft.py][INFO] Epoch:[1/2](52000/63764) loss:3.636 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 19:39:13,973][model3_sft.py][INFO] Epoch:[1/2](52050/63764) loss:2.924 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 19:39:32,417][model3_sft.py][INFO] Epoch:[1/2](52100/63764) loss:3.803 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 19:39:50,920][model3_sft.py][INFO] Epoch:[1/2](52150/63764) loss:3.117 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 19:40:09,396][model3_sft.py][INFO] Epoch:[1/2](52200/63764) loss:3.432 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 19:40:27,864][model3_sft.py][INFO] Epoch:[1/2](52250/63764) loss:4.253 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 19:40:46,334][model3_sft.py][INFO] Epoch:[1/2](52300/63764) loss:2.983 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 19:41:04,772][model3_sft.py][INFO] Epoch:[1/2](52350/63764) loss:3.355 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 19:41:23,292][model3_sft.py][INFO] Epoch:[1/2](52400/63764) loss:3.334 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 19:41:41,751][model3_sft.py][INFO] Epoch:[1/2](52450/63764) loss:2.879 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 19:42:00,239][model3_sft.py][INFO] Epoch:[1/2](52500/63764) loss:3.196 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 19:42:18,924][model3_sft.py][INFO] Epoch:[1/2](52550/63764) loss:3.656 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 19:42:37,399][model3_sft.py][INFO] Epoch:[1/2](52600/63764) loss:2.752 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 19:42:55,871][model3_sft.py][INFO] Epoch:[1/2](52650/63764) loss:3.456 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 19:43:14,363][model3_sft.py][INFO] Epoch:[1/2](52700/63764) loss:3.428 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 
19:43:32,846][model3_sft.py][INFO] Epoch:[1/2](52750/63764) loss:3.370 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 19:43:51,359][model3_sft.py][INFO] Epoch:[1/2](52800/63764) loss:2.981 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 19:44:09,852][model3_sft.py][INFO] Epoch:[1/2](52850/63764) loss:3.147 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 19:44:28,314][model3_sft.py][INFO] Epoch:[1/2](52900/63764) loss:3.258 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 19:44:46,725][model3_sft.py][INFO] Epoch:[1/2](52950/63764) loss:2.790 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 19:45:05,113][model3_sft.py][INFO] Epoch:[1/2](53000/63764) loss:3.139 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 19:45:23,533][model3_sft.py][INFO] Epoch:[1/2](53050/63764) loss:3.322 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 19:45:41,899][model3_sft.py][INFO] Epoch:[1/2](53100/63764) loss:3.569 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 19:46:00,279][model3_sft.py][INFO] Epoch:[1/2](53150/63764) loss:3.387 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 19:46:18,652][model3_sft.py][INFO] Epoch:[1/2](53200/63764) loss:3.111 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 19:46:37,073][model3_sft.py][INFO] Epoch:[1/2](53250/63764) loss:3.360 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 19:46:55,463][model3_sft.py][INFO] Epoch:[1/2](53300/63764) loss:2.941 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 19:47:13,851][model3_sft.py][INFO] Epoch:[1/2](53350/63764) loss:2.670 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 19:47:32,255][model3_sft.py][INFO] Epoch:[1/2](53400/63764) loss:2.786 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 19:47:50,691][model3_sft.py][INFO] Epoch:[1/2](53450/63764) loss:3.151 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 19:48:09,318][model3_sft.py][INFO] Epoch:[1/2](53500/63764) loss:3.546 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 19:48:27,733][model3_sft.py][INFO] Epoch:[1/2](53550/63764) loss:2.883 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 
19:48:46,101][model3_sft.py][INFO] Epoch:[1/2](53600/63764) loss:3.440 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 19:49:04,508][model3_sft.py][INFO] Epoch:[1/2](53650/63764) loss:3.023 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 19:49:22,898][model3_sft.py][INFO] Epoch:[1/2](53700/63764) loss:3.923 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 19:49:41,279][model3_sft.py][INFO] Epoch:[1/2](53750/63764) loss:3.304 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 19:49:59,675][model3_sft.py][INFO] Epoch:[1/2](53800/63764) loss:3.046 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 19:50:18,085][model3_sft.py][INFO] Epoch:[1/2](53850/63764) loss:3.187 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 19:50:36,522][model3_sft.py][INFO] Epoch:[1/2](53900/63764) loss:3.059 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 19:50:54,972][model3_sft.py][INFO] Epoch:[1/2](53950/63764) loss:2.339 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 19:51:13,443][model3_sft.py][INFO] Epoch:[1/2](54000/63764) loss:2.753 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 19:51:31,837][model3_sft.py][INFO] Epoch:[1/2](54050/63764) loss:3.153 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 19:51:50,228][model3_sft.py][INFO] Epoch:[1/2](54100/63764) loss:3.364 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 19:52:08,611][model3_sft.py][INFO] Epoch:[1/2](54150/63764) loss:2.956 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 19:52:27,016][model3_sft.py][INFO] Epoch:[1/2](54200/63764) loss:3.844 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 19:52:45,421][model3_sft.py][INFO] Epoch:[1/2](54250/63764) loss:3.347 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 19:53:03,868][model3_sft.py][INFO] Epoch:[1/2](54300/63764) loss:3.215 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 19:53:22,219][model3_sft.py][INFO] Epoch:[1/2](54350/63764) loss:3.504 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 19:53:40,648][model3_sft.py][INFO] Epoch:[1/2](54400/63764) loss:3.539 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 
19:53:59,344][model3_sft.py][INFO] Epoch:[1/2](54450/63764) loss:3.153 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 19:54:17,759][model3_sft.py][INFO] Epoch:[1/2](54500/63764) loss:2.940 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 19:54:36,158][model3_sft.py][INFO] Epoch:[1/2](54550/63764) loss:3.502 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 19:54:54,554][model3_sft.py][INFO] Epoch:[1/2](54600/63764) loss:2.935 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 19:55:12,967][model3_sft.py][INFO] Epoch:[1/2](54650/63764) loss:3.850 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 19:55:31,378][model3_sft.py][INFO] Epoch:[1/2](54700/63764) loss:3.370 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 19:55:49,745][model3_sft.py][INFO] Epoch:[1/2](54750/63764) loss:3.328 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 19:56:08,168][model3_sft.py][INFO] Epoch:[1/2](54800/63764) loss:3.367 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 19:56:26,571][model3_sft.py][INFO] Epoch:[1/2](54850/63764) loss:3.056 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 19:56:44,945][model3_sft.py][INFO] Epoch:[1/2](54900/63764) loss:3.684 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 19:57:03,333][model3_sft.py][INFO] Epoch:[1/2](54950/63764) loss:2.663 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 19:57:21,708][model3_sft.py][INFO] Epoch:[1/2](55000/63764) loss:3.308 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 19:57:40,079][model3_sft.py][INFO] Epoch:[1/2](55050/63764) loss:3.163 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 19:57:58,462][model3_sft.py][INFO] Epoch:[1/2](55100/63764) loss:3.428 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 19:58:16,849][model3_sft.py][INFO] Epoch:[1/2](55150/63764) loss:3.509 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 19:58:35,354][model3_sft.py][INFO] Epoch:[1/2](55200/63764) loss:2.728 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 19:58:53,784][model3_sft.py][INFO] Epoch:[1/2](55250/63764) loss:3.325 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 
19:59:12,244][model3_sft.py][INFO] Epoch:[1/2](55300/63764) loss:2.905 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 19:59:30,664][model3_sft.py][INFO] Epoch:[1/2](55350/63764) loss:3.419 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 19:59:49,284][model3_sft.py][INFO] Epoch:[1/2](55400/63764) loss:3.429 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 20:00:07,720][model3_sft.py][INFO] Epoch:[1/2](55450/63764) loss:2.987 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 20:00:26,092][model3_sft.py][INFO] Epoch:[1/2](55500/63764) loss:2.833 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 20:00:44,530][model3_sft.py][INFO] Epoch:[1/2](55550/63764) loss:3.299 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 20:01:02,923][model3_sft.py][INFO] Epoch:[1/2](55600/63764) loss:3.292 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 20:01:21,331][model3_sft.py][INFO] Epoch:[1/2](55650/63764) loss:3.443 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 20:01:39,745][model3_sft.py][INFO] Epoch:[1/2](55700/63764) loss:3.345 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 20:01:58,138][model3_sft.py][INFO] Epoch:[1/2](55750/63764) loss:3.361 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 20:02:16,527][model3_sft.py][INFO] Epoch:[1/2](55800/63764) loss:4.001 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 20:02:34,923][model3_sft.py][INFO] Epoch:[1/2](55850/63764) loss:3.122 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 20:02:53,378][model3_sft.py][INFO] Epoch:[1/2](55900/63764) loss:2.973 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 20:03:11,812][model3_sft.py][INFO] Epoch:[1/2](55950/63764) loss:3.025 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 20:03:30,226][model3_sft.py][INFO] Epoch:[1/2](56000/63764) loss:3.216 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 20:03:48,635][model3_sft.py][INFO] Epoch:[1/2](56050/63764) loss:3.080 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 20:04:07,014][model3_sft.py][INFO] Epoch:[1/2](56100/63764) loss:3.367 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 
20:04:25,424][model3_sft.py][INFO] Epoch:[1/2](56150/63764) loss:3.132 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 20:04:43,856][model3_sft.py][INFO] Epoch:[1/2](56200/63764) loss:2.992 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 20:05:02,306][model3_sft.py][INFO] Epoch:[1/2](56250/63764) loss:3.392 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 20:05:20,696][model3_sft.py][INFO] Epoch:[1/2](56300/63764) loss:3.258 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 20:05:39,087][model3_sft.py][INFO] Epoch:[1/2](56350/63764) loss:2.993 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 20:05:57,729][model3_sft.py][INFO] Epoch:[1/2](56400/63764) loss:3.670 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 20:06:16,148][model3_sft.py][INFO] Epoch:[1/2](56450/63764) loss:3.628 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 20:06:34,560][model3_sft.py][INFO] Epoch:[1/2](56500/63764) loss:2.839 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 20:06:52,991][model3_sft.py][INFO] Epoch:[1/2](56550/63764) loss:3.485 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 20:07:11,411][model3_sft.py][INFO] Epoch:[1/2](56600/63764) loss:2.687 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 20:07:29,850][model3_sft.py][INFO] Epoch:[1/2](56650/63764) loss:2.653 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 20:07:48,300][model3_sft.py][INFO] Epoch:[1/2](56700/63764) loss:3.051 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 20:08:06,694][model3_sft.py][INFO] Epoch:[1/2](56750/63764) loss:3.694 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 20:08:25,095][model3_sft.py][INFO] Epoch:[1/2](56800/63764) loss:2.946 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 20:08:43,545][model3_sft.py][INFO] Epoch:[1/2](56850/63764) loss:3.794 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 20:09:01,935][model3_sft.py][INFO] Epoch:[1/2](56900/63764) loss:3.363 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 20:09:20,343][model3_sft.py][INFO] Epoch:[1/2](56950/63764) loss:3.050 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 
20:09:38,791][model3_sft.py][INFO] Epoch:[1/2](57000/63764) loss:2.540 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 20:09:57,277][model3_sft.py][INFO] Epoch:[1/2](57050/63764) loss:3.489 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 20:10:15,698][model3_sft.py][INFO] Epoch:[1/2](57100/63764) loss:2.827 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 20:10:34,155][model3_sft.py][INFO] Epoch:[1/2](57150/63764) loss:2.753 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 20:10:52,588][model3_sft.py][INFO] Epoch:[1/2](57200/63764) loss:3.552 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 20:11:10,994][model3_sft.py][INFO] Epoch:[1/2](57250/63764) loss:3.775 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 20:11:29,420][model3_sft.py][INFO] Epoch:[1/2](57300/63764) loss:3.995 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 20:11:48,165][model3_sft.py][INFO] Epoch:[1/2](57350/63764) loss:3.123 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 20:12:06,562][model3_sft.py][INFO] Epoch:[1/2](57400/63764) loss:2.764 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 20:12:25,019][model3_sft.py][INFO] Epoch:[1/2](57450/63764) loss:3.095 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 20:12:43,437][model3_sft.py][INFO] Epoch:[1/2](57500/63764) loss:3.103 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 20:13:01,882][model3_sft.py][INFO] Epoch:[1/2](57550/63764) loss:3.357 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 20:13:20,314][model3_sft.py][INFO] Epoch:[1/2](57600/63764) loss:3.289 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 20:13:38,756][model3_sft.py][INFO] Epoch:[1/2](57650/63764) loss:3.402 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 20:13:57,197][model3_sft.py][INFO] Epoch:[1/2](57700/63764) loss:2.913 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 20:14:15,668][model3_sft.py][INFO] Epoch:[1/2](57750/63764) loss:3.171 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 20:14:34,125][model3_sft.py][INFO] Epoch:[1/2](57800/63764) loss:3.128 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 
20:14:52,559][model3_sft.py][INFO] Epoch:[1/2](57850/63764) loss:2.883 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 20:15:10,999][model3_sft.py][INFO] Epoch:[1/2](57900/63764) loss:3.213 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 20:15:29,471][model3_sft.py][INFO] Epoch:[1/2](57950/63764) loss:2.804 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 20:15:47,883][model3_sft.py][INFO] Epoch:[1/2](58000/63764) loss:3.230 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 20:16:06,250][model3_sft.py][INFO] Epoch:[1/2](58050/63764) loss:4.398 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 20:16:24,643][model3_sft.py][INFO] Epoch:[1/2](58100/63764) loss:3.663 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 20:16:43,074][model3_sft.py][INFO] Epoch:[1/2](58150/63764) loss:3.032 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 20:17:01,472][model3_sft.py][INFO] Epoch:[1/2](58200/63764) loss:3.156 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 20:17:19,891][model3_sft.py][INFO] Epoch:[1/2](58250/63764) loss:3.745 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 20:17:38,511][model3_sft.py][INFO] Epoch:[1/2](58300/63764) loss:3.528 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 20:17:56,890][model3_sft.py][INFO] Epoch:[1/2](58350/63764) loss:3.747 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 20:18:15,373][model3_sft.py][INFO] Epoch:[1/2](58400/63764) loss:3.359 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 20:18:33,803][model3_sft.py][INFO] Epoch:[1/2](58450/63764) loss:2.562 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 20:18:52,245][model3_sft.py][INFO] Epoch:[1/2](58500/63764) loss:3.392 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 20:19:10,676][model3_sft.py][INFO] Epoch:[1/2](58550/63764) loss:3.143 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 20:19:29,095][model3_sft.py][INFO] Epoch:[1/2](58600/63764) loss:3.248 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 20:19:47,524][model3_sft.py][INFO] Epoch:[1/2](58650/63764) loss:2.816 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 
20:20:05,949][model3_sft.py][INFO] Epoch:[1/2](58700/63764) loss:3.266 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 20:20:24,426][model3_sft.py][INFO] Epoch:[1/2](58750/63764) loss:3.357 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 20:20:42,844][model3_sft.py][INFO] Epoch:[1/2](58800/63764) loss:3.823 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 20:21:01,254][model3_sft.py][INFO] Epoch:[1/2](58850/63764) loss:2.820 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 20:21:19,708][model3_sft.py][INFO] Epoch:[1/2](58900/63764) loss:3.516 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 20:21:38,137][model3_sft.py][INFO] Epoch:[1/2](58950/63764) loss:3.631 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 20:21:56,543][model3_sft.py][INFO] Epoch:[1/2](59000/63764) loss:3.715 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 20:22:15,011][model3_sft.py][INFO] Epoch:[1/2](59050/63764) loss:3.476 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 20:22:33,461][model3_sft.py][INFO] Epoch:[1/2](59100/63764) loss:2.883 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 20:22:51,887][model3_sft.py][INFO] Epoch:[1/2](59150/63764) loss:3.984 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 20:23:10,309][model3_sft.py][INFO] Epoch:[1/2](59200/63764) loss:3.294 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 20:23:28,987][model3_sft.py][INFO] Epoch:[1/2](59250/63764) loss:3.057 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 20:23:47,437][model3_sft.py][INFO] Epoch:[1/2](59300/63764) loss:3.148 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 20:24:05,861][model3_sft.py][INFO] Epoch:[1/2](59350/63764) loss:2.814 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 20:24:24,270][model3_sft.py][INFO] Epoch:[1/2](59400/63764) loss:3.164 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 20:24:42,752][model3_sft.py][INFO] Epoch:[1/2](59450/63764) loss:3.071 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 20:25:01,187][model3_sft.py][INFO] Epoch:[1/2](59500/63764) loss:3.424 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 
20:25:19,625][model3_sft.py][INFO] Epoch:[1/2](59550/63764) loss:3.045 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 20:25:38,053][model3_sft.py][INFO] Epoch:[1/2](59600/63764) loss:3.529 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 20:25:56,441][model3_sft.py][INFO] Epoch:[1/2](59650/63764) loss:2.945 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 20:26:14,897][model3_sft.py][INFO] Epoch:[1/2](59700/63764) loss:3.459 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 20:26:33,306][model3_sft.py][INFO] Epoch:[1/2](59750/63764) loss:3.815 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 20:26:51,750][model3_sft.py][INFO] Epoch:[1/2](59800/63764) loss:3.454 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 20:27:10,219][model3_sft.py][INFO] Epoch:[1/2](59850/63764) loss:2.421 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 20:27:28,623][model3_sft.py][INFO] Epoch:[1/2](59900/63764) loss:2.778 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 20:27:47,086][model3_sft.py][INFO] Epoch:[1/2](59950/63764) loss:3.133 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 20:28:05,538][model3_sft.py][INFO] Epoch:[1/2](60000/63764) loss:3.429 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 20:28:23,996][model3_sft.py][INFO] Epoch:[1/2](60050/63764) loss:3.768 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 20:28:42,471][model3_sft.py][INFO] Epoch:[1/2](60100/63764) loss:3.064 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 20:29:00,947][model3_sft.py][INFO] Epoch:[1/2](60150/63764) loss:3.146 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 20:29:19,607][model3_sft.py][INFO] Epoch:[1/2](60200/63764) loss:3.572 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 20:29:38,083][model3_sft.py][INFO] Epoch:[1/2](60250/63764) loss:3.357 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 20:29:56,536][model3_sft.py][INFO] Epoch:[1/2](60300/63764) loss:3.129 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 20:30:15,019][model3_sft.py][INFO] Epoch:[1/2](60350/63764) loss:3.209 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 
20:30:33,435][model3_sft.py][INFO] Epoch:[1/2](60400/63764) loss:3.212 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 20:30:51,893][model3_sft.py][INFO] Epoch:[1/2](60450/63764) loss:3.010 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 20:31:10,330][model3_sft.py][INFO] Epoch:[1/2](60500/63764) loss:3.965 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 20:31:28,820][model3_sft.py][INFO] Epoch:[1/2](60550/63764) loss:2.923 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 20:31:47,329][model3_sft.py][INFO] Epoch:[1/2](60600/63764) loss:2.586 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 20:32:05,737][model3_sft.py][INFO] Epoch:[1/2](60650/63764) loss:3.563 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 20:32:24,153][model3_sft.py][INFO] Epoch:[1/2](60700/63764) loss:2.599 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 20:32:42,546][model3_sft.py][INFO] Epoch:[1/2](60750/63764) loss:3.358 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 20:33:00,972][model3_sft.py][INFO] Epoch:[1/2](60800/63764) loss:3.357 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 20:33:19,384][model3_sft.py][INFO] Epoch:[1/2](60850/63764) loss:3.419 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 20:33:37,788][model3_sft.py][INFO] Epoch:[1/2](60900/63764) loss:2.217 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 20:33:56,254][model3_sft.py][INFO] Epoch:[1/2](60950/63764) loss:2.982 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 20:34:14,652][model3_sft.py][INFO] Epoch:[1/2](61000/63764) loss:2.930 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 20:34:33,062][model3_sft.py][INFO] Epoch:[1/2](61050/63764) loss:2.899 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 20:34:51,460][model3_sft.py][INFO] Epoch:[1/2](61100/63764) loss:3.256 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 20:35:09,923][model3_sft.py][INFO] Epoch:[1/2](61150/63764) loss:3.742 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 20:35:28,519][model3_sft.py][INFO] Epoch:[1/2](61200/63764) loss:2.890 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 
20:35:46,933][model3_sft.py][INFO] Epoch:[1/2](61250/63764) loss:3.170 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 20:36:05,398][model3_sft.py][INFO] Epoch:[1/2](61300/63764) loss:3.094 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 20:36:23,829][model3_sft.py][INFO] Epoch:[1/2](61350/63764) loss:2.638 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 20:36:42,268][model3_sft.py][INFO] Epoch:[1/2](61400/63764) loss:2.617 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 20:37:00,743][model3_sft.py][INFO] Epoch:[1/2](61450/63764) loss:3.082 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 20:37:19,118][model3_sft.py][INFO] Epoch:[1/2](61500/63764) loss:3.630 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 20:37:37,527][model3_sft.py][INFO] Epoch:[1/2](61550/63764) loss:3.068 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 20:37:55,970][model3_sft.py][INFO] Epoch:[1/2](61600/63764) loss:4.128 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 20:38:14,442][model3_sft.py][INFO] Epoch:[1/2](61650/63764) loss:3.456 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 20:38:32,873][model3_sft.py][INFO] Epoch:[1/2](61700/63764) loss:3.022 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 20:38:51,299][model3_sft.py][INFO] Epoch:[1/2](61750/63764) loss:3.648 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 20:39:09,694][model3_sft.py][INFO] Epoch:[1/2](61800/63764) loss:3.438 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 20:39:28,101][model3_sft.py][INFO] Epoch:[1/2](61850/63764) loss:3.146 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 20:39:46,532][model3_sft.py][INFO] Epoch:[1/2](61900/63764) loss:3.366 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 20:40:04,959][model3_sft.py][INFO] Epoch:[1/2](61950/63764) loss:3.344 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 20:40:23,334][model3_sft.py][INFO] Epoch:[1/2](62000/63764) loss:3.390 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 20:40:41,726][model3_sft.py][INFO] Epoch:[1/2](62050/63764) loss:3.522 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 
20:41:00,111][model3_sft.py][INFO] Epoch:[1/2](62100/63764) loss:3.198 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 20:41:18,691][model3_sft.py][INFO] Epoch:[1/2](62150/63764) loss:3.189 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 20:41:37,060][model3_sft.py][INFO] Epoch:[1/2](62200/63764) loss:2.978 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 20:41:55,551][model3_sft.py][INFO] Epoch:[1/2](62250/63764) loss:2.947 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 20:42:14,006][model3_sft.py][INFO] Epoch:[1/2](62300/63764) loss:3.707 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 20:42:32,380][model3_sft.py][INFO] Epoch:[1/2](62350/63764) loss:3.762 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 20:42:50,799][model3_sft.py][INFO] Epoch:[1/2](62400/63764) loss:3.444 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 20:43:09,199][model3_sft.py][INFO] Epoch:[1/2](62450/63764) loss:3.116 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 20:43:27,650][model3_sft.py][INFO] Epoch:[1/2](62500/63764) loss:2.605 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 20:43:46,074][model3_sft.py][INFO] Epoch:[1/2](62550/63764) loss:2.872 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 20:44:04,499][model3_sft.py][INFO] Epoch:[1/2](62600/63764) loss:3.693 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 20:44:22,845][model3_sft.py][INFO] Epoch:[1/2](62650/63764) loss:3.625 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 20:44:41,255][model3_sft.py][INFO] Epoch:[1/2](62700/63764) loss:2.931 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 20:44:59,627][model3_sft.py][INFO] Epoch:[1/2](62750/63764) loss:3.281 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 20:45:18,012][model3_sft.py][INFO] Epoch:[1/2](62800/63764) loss:3.690 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 20:45:36,409][model3_sft.py][INFO] Epoch:[1/2](62850/63764) loss:3.103 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 20:45:54,820][model3_sft.py][INFO] Epoch:[1/2](62900/63764) loss:2.896 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 20:46:13,239][model3_sft.py][INFO] 
Epoch:[1/2](62950/63764) loss:3.004 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 20:46:31,612][model3_sft.py][INFO] Epoch:[1/2](63000/63764) loss:2.756 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 20:46:50,019][model3_sft.py][INFO] Epoch:[1/2](63050/63764) loss:3.172 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 20:47:08,657][model3_sft.py][INFO] Epoch:[1/2](63100/63764) loss:3.501 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 20:47:27,041][model3_sft.py][INFO] Epoch:[1/2](63150/63764) loss:2.898 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 20:47:45,464][model3_sft.py][INFO] Epoch:[1/2](63200/63764) loss:2.803 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 20:48:03,866][model3_sft.py][INFO] Epoch:[1/2](63250/63764) loss:3.848 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 20:48:22,316][model3_sft.py][INFO] Epoch:[1/2](63300/63764) loss:3.709 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 20:48:40,688][model3_sft.py][INFO] Epoch:[1/2](63350/63764) loss:2.773 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 20:48:59,126][model3_sft.py][INFO] Epoch:[1/2](63400/63764) loss:3.334 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 20:49:17,561][model3_sft.py][INFO] Epoch:[1/2](63450/63764) loss:3.398 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 20:49:35,982][model3_sft.py][INFO] Epoch:[1/2](63500/63764) loss:3.766 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 20:49:54,378][model3_sft.py][INFO] Epoch:[1/2](63550/63764) loss:3.453 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 20:50:12,824][model3_sft.py][INFO] Epoch:[1/2](63600/63764) loss:2.887 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 20:50:31,244][model3_sft.py][INFO] Epoch:[1/2](63650/63764) loss:3.222 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 20:50:49,639][model3_sft.py][INFO] Epoch:[1/2](63700/63764) loss:3.245 lr:0.0000010 epoch_Time:0.0min: [2023-12-11 20:51:08,052][model3_sft.py][INFO] Epoch:[1/2](63750/63764) loss:3.940 lr:0.0000010 epoch_Time:0.0min: