[2023-12-11 07:43:26,005][model2_sft.py][INFO] Epoch:[0/2](0/63764) loss:3.378 lr:0.0000000 epoch_Time:902.0min:
[2023-12-11 07:43:37,053][model2_sft.py][INFO] Epoch:[0/2](50/63764) loss:3.163 lr:0.0000010 epoch_Time:247.0min:
[2023-12-11 07:43:48,145][model2_sft.py][INFO] Epoch:[0/2](100/63764) loss:3.387 lr:0.0000020 epoch_Time:241.0min:
[2023-12-11 07:43:59,227][model2_sft.py][INFO] Epoch:[0/2](150/63764) loss:3.535 lr:0.0000030 epoch_Time:239.0min:
[2023-12-11 07:44:10,253][model2_sft.py][INFO] Epoch:[0/2](200/63764) loss:2.672 lr:0.0000040 epoch_Time:238.0min:
[2023-12-11 07:44:21,320][model2_sft.py][INFO] Epoch:[0/2](250/63764) loss:3.066 lr:0.0000050 epoch_Time:237.0min:
[2023-12-11 07:44:32,341][model2_sft.py][INFO] Epoch:[0/2](300/63764) loss:3.576 lr:0.0000060 epoch_Time:236.0min:
[2023-12-11 07:44:43,394][model2_sft.py][INFO] Epoch:[0/2](350/63764) loss:3.124 lr:0.0000070 epoch_Time:235.0min:
[2023-12-11 07:44:54,399][model2_sft.py][INFO] Epoch:[0/2](400/63764) loss:2.683 lr:0.0000080 epoch_Time:235.0min:
[2023-12-11 07:45:05,414][model2_sft.py][INFO] Epoch:[0/2](450/63764) loss:2.826 lr:0.0000090 epoch_Time:235.0min:
[2023-12-11 07:45:16,388][model2_sft.py][INFO] Epoch:[0/2](500/63764) loss:3.277 lr:0.0000100 epoch_Time:234.0min:
[2023-12-11 07:45:27,432][model2_sft.py][INFO] Epoch:[0/2](550/63764) loss:3.287 lr:0.0000110 epoch_Time:233.0min:
[2023-12-11 07:45:38,472][model2_sft.py][INFO] Epoch:[0/2](600/63764) loss:3.120 lr:0.0000120 epoch_Time:233.0min:
[2023-12-11 07:45:49,498][model2_sft.py][INFO] Epoch:[0/2](650/63764) loss:3.306 lr:0.0000130 epoch_Time:233.0min:
[2023-12-11 07:46:00,488][model2_sft.py][INFO] Epoch:[0/2](700/63764) loss:3.274 lr:0.0000140 epoch_Time:233.0min:
[2023-12-11 07:46:11,621][model2_sft.py][INFO] Epoch:[0/2](750/63764) loss:2.992 lr:0.0000150 epoch_Time:233.0min:
[2023-12-11 07:46:22,634][model2_sft.py][INFO] Epoch:[0/2](800/63764) loss:3.121 lr:0.0000160 epoch_Time:233.0min:
[2023-12-11 07:46:33,668][model2_sft.py][INFO] Epoch:[0/2](850/63764) loss:3.300 lr:0.0000170 epoch_Time:232.0min:
[2023-12-11 07:46:44,713][model2_sft.py][INFO] Epoch:[0/2](900/63764) loss:3.647 lr:0.0000180 epoch_Time:232.0min:
[2023-12-11 07:46:55,775][model2_sft.py][INFO] Epoch:[0/2](950/63764) loss:3.619 lr:0.0000190 epoch_Time:232.0min:
[2023-12-11 07:47:06,827][model2_sft.py][INFO] Epoch:[0/2](1000/63764) loss:3.326 lr:0.0000200 epoch_Time:232.0min:
[2023-12-11 07:47:17,849][model2_sft.py][INFO] Epoch:[0/2](1050/63764) loss:2.702 lr:0.0000200 epoch_Time:232.0min:
[2023-12-11 07:47:28,936][model2_sft.py][INFO] Epoch:[0/2](1100/63764) loss:3.783 lr:0.0000200 epoch_Time:231.0min:
[2023-12-11 07:47:39,968][model2_sft.py][INFO] Epoch:[0/2](1150/63764) loss:3.085 lr:0.0000200 epoch_Time:231.0min:
[2023-12-11 07:47:50,941][model2_sft.py][INFO] Epoch:[0/2](1200/63764) loss:3.101 lr:0.0000200 epoch_Time:231.0min:
[2023-12-11 07:48:01,940][model2_sft.py][INFO] Epoch:[0/2](1250/63764) loss:3.317 lr:0.0000200 epoch_Time:231.0min:
[2023-12-11 07:48:12,985][model2_sft.py][INFO] Epoch:[0/2](1300/63764) loss:3.489 lr:0.0000200 epoch_Time:231.0min:
[2023-12-11 07:48:24,037][model2_sft.py][INFO] Epoch:[0/2](1350/63764) loss:3.307 lr:0.0000200 epoch_Time:231.0min:
[2023-12-11 07:48:35,094][model2_sft.py][INFO] Epoch:[0/2](1400/63764) loss:3.671 lr:0.0000200 epoch_Time:230.0min:
[2023-12-11 07:48:46,112][model2_sft.py][INFO] Epoch:[0/2](1450/63764) loss:3.441 lr:0.0000200 epoch_Time:230.0min:
[2023-12-11 07:48:57,151][model2_sft.py][INFO] Epoch:[0/2](1500/63764) loss:3.449 lr:0.0000200 epoch_Time:230.0min:
[2023-12-11 07:49:08,126][model2_sft.py][INFO] Epoch:[0/2](1550/63764) loss:3.314 lr:0.0000200 epoch_Time:229.0min:
[2023-12-11 07:49:19,136][model2_sft.py][INFO] Epoch:[0/2](1600/63764) loss:2.962 lr:0.0000200 epoch_Time:229.0min:
[2023-12-11 07:49:30,144][model2_sft.py][INFO] Epoch:[0/2](1650/63764) loss:3.006 lr:0.0000200 epoch_Time:228.0min:
[2023-12-11 07:49:41,135][model2_sft.py][INFO] Epoch:[0/2](1700/63764) loss:2.742 lr:0.0000200 epoch_Time:228.0min:
[2023-12-11 07:49:52,163][model2_sft.py][INFO] Epoch:[0/2](1750/63764) loss:3.776 lr:0.0000200 epoch_Time:228.0min:
[2023-12-11 07:50:03,166][model2_sft.py][INFO] Epoch:[0/2](1800/63764) loss:3.401 lr:0.0000200 epoch_Time:228.0min:
[2023-12-11 07:50:14,225][model2_sft.py][INFO] Epoch:[0/2](1850/63764) loss:3.506 lr:0.0000200 epoch_Time:228.0min:
[2023-12-11 07:50:25,256][model2_sft.py][INFO] Epoch:[0/2](1900/63764) loss:4.175 lr:0.0000200 epoch_Time:227.0min:
[2023-12-11 07:50:36,284][model2_sft.py][INFO] Epoch:[0/2](1950/63764) loss:2.881 lr:0.0000200 epoch_Time:227.0min:
[2023-12-11 07:50:47,273][model2_sft.py][INFO] Epoch:[0/2](2000/63764) loss:2.960 lr:0.0000200 epoch_Time:227.0min:
[2023-12-11 07:50:58,273][model2_sft.py][INFO] Epoch:[0/2](2050/63764) loss:3.508 lr:0.0000200 epoch_Time:227.0min:
[2023-12-11 07:51:09,255][model2_sft.py][INFO] Epoch:[0/2](2100/63764) loss:3.806 lr:0.0000200 epoch_Time:227.0min:
[2023-12-11 07:51:20,265][model2_sft.py][INFO] Epoch:[0/2](2150/63764) loss:3.495 lr:0.0000200 epoch_Time:227.0min:
[2023-12-11 07:51:31,253][model2_sft.py][INFO] Epoch:[0/2](2200/63764) loss:3.998 lr:0.0000200 epoch_Time:226.0min:
[2023-12-11 07:51:42,266][model2_sft.py][INFO] Epoch:[0/2](2250/63764) loss:3.058 lr:0.0000200 epoch_Time:226.0min:
[2023-12-11 07:51:53,272][model2_sft.py][INFO] Epoch:[0/2](2300/63764) loss:3.545 lr:0.0000200 epoch_Time:226.0min:
[2023-12-11 07:52:04,293][model2_sft.py][INFO] Epoch:[0/2](2350/63764) loss:4.198 lr:0.0000200 epoch_Time:226.0min:
[2023-12-11 07:52:15,310][model2_sft.py][INFO] Epoch:[0/2](2400/63764) loss:3.125 lr:0.0000200 epoch_Time:226.0min:
[2023-12-11 07:52:26,320][model2_sft.py][INFO] Epoch:[0/2](2450/63764) loss:3.548 lr:0.0000200 epoch_Time:225.0min:
[2023-12-11 07:52:37,308][model2_sft.py][INFO] Epoch:[0/2](2500/63764) loss:3.671 lr:0.0000200 epoch_Time:225.0min:
[2023-12-11 07:52:48,322][model2_sft.py][INFO] Epoch:[0/2](2550/63764) loss:2.777 lr:0.0000200 epoch_Time:225.0min:
[2023-12-11 07:52:59,334][model2_sft.py][INFO] Epoch:[0/2](2600/63764) loss:3.575 lr:0.0000200 epoch_Time:225.0min:
[2023-12-11 07:53:10,344][model2_sft.py][INFO] Epoch:[0/2](2650/63764) loss:3.301 lr:0.0000199 epoch_Time:225.0min:
[2023-12-11 07:53:21,347][model2_sft.py][INFO] Epoch:[0/2](2700/63764) loss:3.733 lr:0.0000199 epoch_Time:225.0min:
[2023-12-11 07:53:32,349][model2_sft.py][INFO] Epoch:[0/2](2750/63764) loss:3.748 lr:0.0000199 epoch_Time:224.0min:
[2023-12-11 07:53:43,366][model2_sft.py][INFO] Epoch:[0/2](2800/63764) loss:3.053 lr:0.0000199 epoch_Time:224.0min:
[2023-12-11 07:53:54,427][model2_sft.py][INFO] Epoch:[0/2](2850/63764) loss:2.732 lr:0.0000199 epoch_Time:224.0min:
[2023-12-11 07:54:05,451][model2_sft.py][INFO] Epoch:[0/2](2900/63764) loss:3.802 lr:0.0000199 epoch_Time:224.0min:
[2023-12-11 07:54:16,468][model2_sft.py][INFO] Epoch:[0/2](2950/63764) loss:3.859 lr:0.0000199 epoch_Time:224.0min:
[2023-12-11 07:54:27,497][model2_sft.py][INFO] Epoch:[0/2](3000/63764) loss:3.510 lr:0.0000199 epoch_Time:223.0min:
[2023-12-11 07:54:38,570][model2_sft.py][INFO] Epoch:[0/2](3050/63764) loss:3.788 lr:0.0000199 epoch_Time:223.0min:
[2023-12-11 07:54:49,608][model2_sft.py][INFO] Epoch:[0/2](3100/63764) loss:3.310 lr:0.0000199 epoch_Time:223.0min:
[2023-12-11 07:55:00,665][model2_sft.py][INFO] Epoch:[0/2](3150/63764) loss:3.808 lr:0.0000199 epoch_Time:223.0min:
[2023-12-11 07:55:11,696][model2_sft.py][INFO] Epoch:[0/2](3200/63764) loss:3.188 lr:0.0000199 epoch_Time:223.0min:
[2023-12-11 07:55:22,752][model2_sft.py][INFO] Epoch:[0/2](3250/63764) loss:3.915 lr:0.0000199 epoch_Time:223.0min:
[2023-12-11 07:55:33,838][model2_sft.py][INFO] Epoch:[0/2](3300/63764) loss:3.094 lr:0.0000199 epoch_Time:222.0min:
[2023-12-11 07:55:44,916][model2_sft.py][INFO] Epoch:[0/2](3350/63764) loss:4.283 lr:0.0000199 epoch_Time:222.0min:
[2023-12-11 07:55:56,007][model2_sft.py][INFO] Epoch:[0/2](3400/63764) loss:3.429 lr:0.0000199 epoch_Time:222.0min:
[2023-12-11 07:56:07,068][model2_sft.py][INFO] Epoch:[0/2](3450/63764) loss:3.625 lr:0.0000199 epoch_Time:222.0min:
[2023-12-11 07:56:18,122][model2_sft.py][INFO] Epoch:[0/2](3500/63764) loss:3.602 lr:0.0000199 epoch_Time:222.0min:
[2023-12-11 07:56:29,143][model2_sft.py][INFO] Epoch:[0/2](3550/63764) loss:3.216 lr:0.0000199 epoch_Time:221.0min:
[2023-12-11 07:56:40,196][model2_sft.py][INFO] Epoch:[0/2](3600/63764) loss:3.469 lr:0.0000199 epoch_Time:221.0min:
[2023-12-11 07:56:51,235][model2_sft.py][INFO] Epoch:[0/2](3650/63764) loss:3.331 lr:0.0000199 epoch_Time:221.0min:
[2023-12-11 07:57:02,275][model2_sft.py][INFO] Epoch:[0/2](3700/63764) loss:3.183 lr:0.0000199 epoch_Time:221.0min:
[2023-12-11 07:57:13,353][model2_sft.py][INFO] Epoch:[0/2](3750/63764) loss:4.033 lr:0.0000199 epoch_Time:221.0min:
[2023-12-11 07:57:24,425][model2_sft.py][INFO] Epoch:[0/2](3800/63764) loss:3.629 lr:0.0000198 epoch_Time:221.0min:
[2023-12-11 07:57:35,467][model2_sft.py][INFO] Epoch:[0/2](3850/63764) loss:3.771 lr:0.0000198 epoch_Time:220.0min:
[2023-12-11 07:57:46,465][model2_sft.py][INFO] Epoch:[0/2](3900/63764) loss:3.744 lr:0.0000198 epoch_Time:220.0min:
[2023-12-11 07:57:57,461][model2_sft.py][INFO] Epoch:[0/2](3950/63764) loss:3.330 lr:0.0000198 epoch_Time:220.0min:
[2023-12-11 07:58:08,440][model2_sft.py][INFO] Epoch:[0/2](4000/63764) loss:3.469 lr:0.0000198 epoch_Time:220.0min:
[2023-12-11 07:58:19,456][model2_sft.py][INFO] Epoch:[0/2](4050/63764) loss:4.082 lr:0.0000198 epoch_Time:220.0min:
[2023-12-11 07:58:30,473][model2_sft.py][INFO] Epoch:[0/2](4100/63764) loss:3.471 lr:0.0000198 epoch_Time:219.0min:
[2023-12-11 07:58:41,489][model2_sft.py][INFO] Epoch:[0/2](4150/63764) loss:4.013 lr:0.0000198 epoch_Time:219.0min:
[2023-12-11 07:58:52,510][model2_sft.py][INFO] Epoch:[0/2](4200/63764) loss:3.777 lr:0.0000198 epoch_Time:219.0min:
[2023-12-11 07:59:03,510][model2_sft.py][INFO] Epoch:[0/2](4250/63764) loss:3.947 lr:0.0000198 epoch_Time:219.0min:
[2023-12-11 07:59:14,503][model2_sft.py][INFO] Epoch:[0/2](4300/63764) loss:3.617 lr:0.0000198 epoch_Time:219.0min:
[2023-12-11 07:59:25,530][model2_sft.py][INFO] Epoch:[0/2](4350/63764) loss:4.374 lr:0.0000198 epoch_Time:218.0min:
[2023-12-11 07:59:36,544][model2_sft.py][INFO] Epoch:[0/2](4400/63764) loss:3.610 lr:0.0000198 epoch_Time:218.0min:
[2023-12-11 07:59:47,526][model2_sft.py][INFO] Epoch:[0/2](4450/63764) loss:3.503 lr:0.0000198 epoch_Time:218.0min:
[2023-12-11 07:59:58,528][model2_sft.py][INFO] Epoch:[0/2](4500/63764) loss:3.055 lr:0.0000198 epoch_Time:218.0min:
[2023-12-11 08:00:09,506][model2_sft.py][INFO] Epoch:[0/2](4550/63764) loss:3.942 lr:0.0000198 epoch_Time:218.0min:
[2023-12-11 08:00:20,547][model2_sft.py][INFO] Epoch:[0/2](4600/63764) loss:3.199 lr:0.0000197 epoch_Time:218.0min:
[2023-12-11 08:00:31,556][model2_sft.py][INFO] Epoch:[0/2](4650/63764) loss:3.587 lr:0.0000197 epoch_Time:217.0min:
[2023-12-11 08:00:42,613][model2_sft.py][INFO] Epoch:[0/2](4700/63764) loss:3.447 lr:0.0000197 epoch_Time:217.0min:
[2023-12-11 08:00:53,694][model2_sft.py][INFO] Epoch:[0/2](4750/63764) loss:3.347 lr:0.0000197 epoch_Time:217.0min:
[2023-12-11 08:01:04,757][model2_sft.py][INFO] Epoch:[0/2](4800/63764) loss:3.101 lr:0.0000197 epoch_Time:217.0min:
[2023-12-11 08:01:15,873][model2_sft.py][INFO] Epoch:[0/2](4850/63764) loss:3.352 lr:0.0000197 epoch_Time:217.0min:
[2023-12-11 08:01:26,885][model2_sft.py][INFO] Epoch:[0/2](4900/63764) loss:3.686 lr:0.0000197 epoch_Time:216.0min:
[2023-12-11 08:01:37,891][model2_sft.py][INFO] Epoch:[0/2](4950/63764) loss:4.065 lr:0.0000197 epoch_Time:216.0min:
[2023-12-11 08:01:48,909][model2_sft.py][INFO] Epoch:[0/2](5000/63764) loss:4.018 lr:0.0000197 epoch_Time:216.0min:
[2023-12-11 08:01:59,970][model2_sft.py][INFO] Epoch:[0/2](5050/63764) loss:3.427 lr:0.0000197 epoch_Time:216.0min:
[2023-12-11 08:02:10,953][model2_sft.py][INFO] Epoch:[0/2](5100/63764) loss:4.142 lr:0.0000197 epoch_Time:216.0min:
[2023-12-11 08:02:21,955][model2_sft.py][INFO] Epoch:[0/2](5150/63764) loss:3.638 lr:0.0000197 epoch_Time:216.0min:
[2023-12-11 08:02:32,949][model2_sft.py][INFO] Epoch:[0/2](5200/63764) loss:3.570 lr:0.0000197 epoch_Time:215.0min:
[2023-12-11 08:02:43,939][model2_sft.py][INFO] Epoch:[0/2](5250/63764) loss:3.652 lr:0.0000196 epoch_Time:215.0min:
[2023-12-11 08:02:54,962][model2_sft.py][INFO] Epoch:[0/2](5300/63764) loss:2.957 lr:0.0000196 epoch_Time:215.0min:
[2023-12-11 08:03:05,993][model2_sft.py][INFO] Epoch:[0/2](5350/63764) loss:3.784 lr:0.0000196 epoch_Time:215.0min:
[2023-12-11 08:03:16,948][model2_sft.py][INFO] Epoch:[0/2](5400/63764) loss:3.777 lr:0.0000196 epoch_Time:215.0min:
[2023-12-11 08:03:27,942][model2_sft.py][INFO] Epoch:[0/2](5450/63764) loss:3.798 lr:0.0000196 epoch_Time:214.0min:
[2023-12-11 08:03:38,939][model2_sft.py][INFO] Epoch:[0/2](5500/63764) loss:3.847 lr:0.0000196 epoch_Time:214.0min:
[2023-12-11 08:03:49,946][model2_sft.py][INFO] Epoch:[0/2](5550/63764) loss:3.092 lr:0.0000196 epoch_Time:214.0min:
[2023-12-11 08:04:00,979][model2_sft.py][INFO] Epoch:[0/2](5600/63764) loss:3.494 lr:0.0000196 epoch_Time:214.0min:
[2023-12-11 08:04:12,004][model2_sft.py][INFO] Epoch:[0/2](5650/63764) loss:3.898 lr:0.0000196 epoch_Time:214.0min:
[2023-12-11 08:04:23,018][model2_sft.py][INFO] Epoch:[0/2](5700/63764) loss:3.140 lr:0.0000196 epoch_Time:214.0min:
[2023-12-11 08:04:34,017][model2_sft.py][INFO] Epoch:[0/2](5750/63764) loss:3.975 lr:0.0000196 epoch_Time:213.0min:
[2023-12-11 08:04:45,062][model2_sft.py][INFO] Epoch:[0/2](5800/63764) loss:3.709 lr:0.0000196 epoch_Time:213.0min:
[2023-12-11 08:04:56,059][model2_sft.py][INFO] Epoch:[0/2](5850/63764) loss:3.856 lr:0.0000195 epoch_Time:213.0min:
[2023-12-11 08:05:07,042][model2_sft.py][INFO] Epoch:[0/2](5900/63764) loss:2.819 lr:0.0000195 epoch_Time:213.0min:
[2023-12-11 08:05:18,070][model2_sft.py][INFO] Epoch:[0/2](5950/63764) loss:3.052 lr:0.0000195 epoch_Time:213.0min:
[2023-12-11 08:05:29,112][model2_sft.py][INFO] Epoch:[0/2](6000/63764) loss:3.364 lr:0.0000195 epoch_Time:212.0min:
[2023-12-11 08:05:40,119][model2_sft.py][INFO] Epoch:[0/2](6050/63764) loss:3.392 lr:0.0000195 epoch_Time:212.0min:
[2023-12-11 08:05:51,138][model2_sft.py][INFO] Epoch:[0/2](6100/63764) loss:3.359 lr:0.0000195 epoch_Time:212.0min:
[2023-12-11 08:06:02,179][model2_sft.py][INFO] Epoch:[0/2](6150/63764) loss:3.459 lr:0.0000195 epoch_Time:212.0min:
[2023-12-11 08:06:13,256][model2_sft.py][INFO] Epoch:[0/2](6200/63764) loss:3.085 lr:0.0000195 epoch_Time:212.0min:
[2023-12-11 08:06:24,276][model2_sft.py][INFO] Epoch:[0/2](6250/63764) loss:3.866 lr:0.0000195 epoch_Time:212.0min:
[2023-12-11 08:06:35,306][model2_sft.py][INFO] Epoch:[0/2](6300/63764) loss:3.505 lr:0.0000195 epoch_Time:211.0min:
[2023-12-11 08:06:46,320][model2_sft.py][INFO] Epoch:[0/2](6350/63764) loss:3.846 lr:0.0000194 epoch_Time:211.0min:
[2023-12-11 08:06:57,388][model2_sft.py][INFO] Epoch:[0/2](6400/63764) loss:3.499 lr:0.0000194 epoch_Time:211.0min:
[2023-12-11 08:07:08,433][model2_sft.py][INFO] Epoch:[0/2](6450/63764) loss:3.518 lr:0.0000194 epoch_Time:211.0min:
[2023-12-11 08:07:19,466][model2_sft.py][INFO] Epoch:[0/2](6500/63764) loss:3.002 lr:0.0000194 epoch_Time:211.0min:
[2023-12-11 08:07:30,547][model2_sft.py][INFO] Epoch:[0/2](6550/63764) loss:3.725 lr:0.0000194 epoch_Time:210.0min:
[2023-12-11 08:07:41,553][model2_sft.py][INFO] Epoch:[0/2](6600/63764) loss:4.088 lr:0.0000194 epoch_Time:210.0min:
[2023-12-11 08:07:52,557][model2_sft.py][INFO] Epoch:[0/2](6650/63764) loss:3.180 lr:0.0000194 epoch_Time:210.0min:
[2023-12-11 08:08:03,618][model2_sft.py][INFO] Epoch:[0/2](6700/63764) loss:3.273 lr:0.0000194 epoch_Time:210.0min:
[2023-12-11 08:08:14,648][model2_sft.py][INFO] Epoch:[0/2](6750/63764) loss:3.672 lr:0.0000194 epoch_Time:210.0min:
[2023-12-11 08:08:25,724][model2_sft.py][INFO] Epoch:[0/2](6800/63764) loss:4.262 lr:0.0000194 epoch_Time:209.0min:
[2023-12-11 08:08:36,751][model2_sft.py][INFO] Epoch:[0/2](6850/63764) loss:3.918 lr:0.0000193 epoch_Time:209.0min:
[2023-12-11 08:08:47,793][model2_sft.py][INFO] Epoch:[0/2](6900/63764) loss:3.724 lr:0.0000193 epoch_Time:209.0min:
[2023-12-11 08:08:58,804][model2_sft.py][INFO] Epoch:[0/2](6950/63764) loss:3.718 lr:0.0000193 epoch_Time:209.0min:
[2023-12-11 08:09:09,905][model2_sft.py][INFO] Epoch:[0/2](7000/63764) loss:2.963 lr:0.0000193 epoch_Time:209.0min:
[2023-12-11 08:09:20,966][model2_sft.py][INFO] Epoch:[0/2](7050/63764) loss:3.861 lr:0.0000193 epoch_Time:209.0min:
[2023-12-11 08:09:31,971][model2_sft.py][INFO] Epoch:[0/2](7100/63764) loss:3.767 lr:0.0000193 epoch_Time:208.0min:
[2023-12-11 08:09:43,100][model2_sft.py][INFO] Epoch:[0/2](7150/63764) loss:3.386 lr:0.0000193 epoch_Time:208.0min:
[2023-12-11 08:09:54,186][model2_sft.py][INFO] Epoch:[0/2](7200/63764) loss:3.612 lr:0.0000193 epoch_Time:208.0min:
[2023-12-11 08:10:05,271][model2_sft.py][INFO] Epoch:[0/2](7250/63764) loss:3.106 lr:0.0000192 epoch_Time:208.0min:
[2023-12-11 08:10:16,289][model2_sft.py][INFO] Epoch:[0/2](7300/63764) loss:2.710 lr:0.0000192 epoch_Time:208.0min:
[2023-12-11 08:10:27,306][model2_sft.py][INFO] Epoch:[0/2](7350/63764) loss:3.889 lr:0.0000192 epoch_Time:207.0min:
[2023-12-11 08:10:38,306][model2_sft.py][INFO] Epoch:[0/2](7400/63764) loss:3.150 lr:0.0000192 epoch_Time:207.0min:
[2023-12-11 08:10:49,316][model2_sft.py][INFO] Epoch:[0/2](7450/63764) loss:3.946 lr:0.0000192 epoch_Time:207.0min:
[2023-12-11 08:11:00,377][model2_sft.py][INFO] Epoch:[0/2](7500/63764) loss:3.336 lr:0.0000192 epoch_Time:207.0min:
[2023-12-11 08:11:11,415][model2_sft.py][INFO] Epoch:[0/2](7550/63764) loss:3.295 lr:0.0000192 epoch_Time:207.0min:
[2023-12-11 08:11:22,424][model2_sft.py][INFO] Epoch:[0/2](7600/63764) loss:3.467 lr:0.0000192 epoch_Time:207.0min:
[2023-12-11 08:11:33,421][model2_sft.py][INFO] Epoch:[0/2](7650/63764) loss:3.231 lr:0.0000191 epoch_Time:206.0min:
[2023-12-11 08:11:44,466][model2_sft.py][INFO] Epoch:[0/2](7700/63764) loss:3.182 lr:0.0000191 epoch_Time:206.0min:
[2023-12-11 08:11:55,574][model2_sft.py][INFO] Epoch:[0/2](7750/63764) loss:3.380 lr:0.0000191 epoch_Time:206.0min:
[2023-12-11 08:12:06,693][model2_sft.py][INFO] Epoch:[0/2](7800/63764) loss:4.183 lr:0.0000191 epoch_Time:206.0min:
[2023-12-11 08:12:17,749][model2_sft.py][INFO] Epoch:[0/2](7850/63764) loss:2.909 lr:0.0000191 epoch_Time:206.0min:
[2023-12-11 08:12:28,811][model2_sft.py][INFO] Epoch:[0/2](7900/63764) loss:3.527 lr:0.0000191 epoch_Time:205.0min:
[2023-12-11 08:12:39,883][model2_sft.py][INFO] Epoch:[0/2](7950/63764) loss:3.164 lr:0.0000191 epoch_Time:205.0min:
[2023-12-11 08:12:50,986][model2_sft.py][INFO] Epoch:[0/2](8000/63764) loss:3.895 lr:0.0000191 epoch_Time:205.0min:
[2023-12-11 08:13:02,076][model2_sft.py][INFO] Epoch:[0/2](8050/63764) loss:3.396 lr:0.0000190 epoch_Time:205.0min:
[2023-12-11 08:13:13,144][model2_sft.py][INFO] Epoch:[0/2](8100/63764) loss:3.879 lr:0.0000190 epoch_Time:205.0min:
[2023-12-11 08:13:24,165][model2_sft.py][INFO] Epoch:[0/2](8150/63764) loss:2.996 lr:0.0000190 epoch_Time:205.0min:
[2023-12-11 08:13:35,181][model2_sft.py][INFO] Epoch:[0/2](8200/63764) loss:3.857 lr:0.0000190 epoch_Time:204.0min:
[2023-12-11 08:13:46,212][model2_sft.py][INFO] Epoch:[0/2](8250/63764) loss:3.528 lr:0.0000190 epoch_Time:204.0min:
[2023-12-11 08:13:57,246][model2_sft.py][INFO] Epoch:[0/2](8300/63764) loss:2.898 lr:0.0000190 epoch_Time:204.0min:
[2023-12-11 08:14:08,256][model2_sft.py][INFO] Epoch:[0/2](8350/63764) loss:4.492 lr:0.0000190 epoch_Time:204.0min:
[2023-12-11 08:14:19,290][model2_sft.py][INFO] Epoch:[0/2](8400/63764) loss:3.780 lr:0.0000190 epoch_Time:204.0min:
[2023-12-11 08:14:30,341][model2_sft.py][INFO] Epoch:[0/2](8450/63764) loss:4.153 lr:0.0000189 epoch_Time:203.0min:
[2023-12-11 08:14:41,387][model2_sft.py][INFO] Epoch:[0/2](8500/63764) loss:3.634 lr:0.0000189 epoch_Time:203.0min:
[2023-12-11 08:14:52,458][model2_sft.py][INFO] Epoch:[0/2](8550/63764) loss:3.286 lr:0.0000189 epoch_Time:203.0min:
[2023-12-11 08:15:03,495][model2_sft.py][INFO] Epoch:[0/2](8600/63764) loss:3.419 lr:0.0000189 epoch_Time:203.0min:
[2023-12-11 08:15:14,505][model2_sft.py][INFO] Epoch:[0/2](8650/63764) loss:4.104 lr:0.0000189 epoch_Time:203.0min:
[2023-12-11 08:15:25,528][model2_sft.py][INFO] Epoch:[0/2](8700/63764) loss:3.828 lr:0.0000189 epoch_Time:202.0min:
[2023-12-11 08:15:36,570][model2_sft.py][INFO] Epoch:[0/2](8750/63764) loss:3.259 lr:0.0000189 epoch_Time:202.0min:
[2023-12-11 08:15:47,622][model2_sft.py][INFO] Epoch:[0/2](8800/63764) loss:3.523 lr:0.0000188 epoch_Time:202.0min:
[2023-12-11 08:15:58,617][model2_sft.py][INFO] Epoch:[0/2](8850/63764) loss:3.779 lr:0.0000188 epoch_Time:202.0min:
[2023-12-11 08:16:09,626][model2_sft.py][INFO] Epoch:[0/2](8900/63764) loss:4.013 lr:0.0000188 epoch_Time:202.0min:
[2023-12-11 08:16:20,655][model2_sft.py][INFO] Epoch:[0/2](8950/63764) loss:3.744 lr:0.0000188 epoch_Time:202.0min:
[2023-12-11 08:16:31,682][model2_sft.py][INFO] Epoch:[0/2](9000/63764) loss:3.178 lr:0.0000188 epoch_Time:201.0min:
[2023-12-11 08:16:42,706][model2_sft.py][INFO] Epoch:[0/2](9050/63764) loss:3.334 lr:0.0000188 epoch_Time:201.0min:
[2023-12-11 08:16:53,708][model2_sft.py][INFO] Epoch:[0/2](9100/63764) loss:3.504 lr:0.0000187 epoch_Time:201.0min:
[2023-12-11 08:17:04,734][model2_sft.py][INFO] Epoch:[0/2](9150/63764) loss:3.873 lr:0.0000187 epoch_Time:201.0min:
[2023-12-11 08:17:15,768][model2_sft.py][INFO] Epoch:[0/2](9200/63764) loss:4.109 lr:0.0000187 epoch_Time:201.0min:
[2023-12-11 08:17:26,813][model2_sft.py][INFO] Epoch:[0/2](9250/63764) loss:3.407 lr:0.0000187 epoch_Time:200.0min:
[2023-12-11 08:17:37,820][model2_sft.py][INFO] Epoch:[0/2](9300/63764) loss:3.784 lr:0.0000187 epoch_Time:200.0min:
[2023-12-11 08:17:48,886][model2_sft.py][INFO] Epoch:[0/2](9350/63764) loss:3.571 lr:0.0000187 epoch_Time:200.0min:
[2023-12-11 08:17:59,963][model2_sft.py][INFO] Epoch:[0/2](9400/63764) loss:3.949 lr:0.0000187 epoch_Time:200.0min:
[2023-12-11 08:18:11,120][model2_sft.py][INFO] Epoch:[0/2](9450/63764) loss:3.710 lr:0.0000186 epoch_Time:200.0min:
[2023-12-11 08:18:22,194][model2_sft.py][INFO] Epoch:[0/2](9500/63764) loss:3.709 lr:0.0000186 epoch_Time:200.0min:
[2023-12-11 08:18:33,284][model2_sft.py][INFO] Epoch:[0/2](9550/63764) loss:3.816 lr:0.0000186 epoch_Time:199.0min:
[2023-12-11 08:18:44,384][model2_sft.py][INFO] Epoch:[0/2](9600/63764) loss:3.214 lr:0.0000186 epoch_Time:199.0min:
[2023-12-11 08:18:55,394][model2_sft.py][INFO] Epoch:[0/2](9650/63764) loss:3.788 lr:0.0000186 epoch_Time:199.0min:
[2023-12-11 08:19:06,458][model2_sft.py][INFO] Epoch:[0/2](9700/63764) loss:3.359 lr:0.0000186 epoch_Time:199.0min:
[2023-12-11 08:19:17,496][model2_sft.py][INFO] Epoch:[0/2](9750/63764) loss:3.360 lr:0.0000185 epoch_Time:199.0min:
[2023-12-11 08:19:28,557][model2_sft.py][INFO] Epoch:[0/2](9800/63764) loss:3.301 lr:0.0000185 epoch_Time:198.0min:
[2023-12-11 08:19:39,577][model2_sft.py][INFO] Epoch:[0/2](9850/63764) loss:3.349 lr:0.0000185 epoch_Time:198.0min:
[2023-12-11 08:19:50,573][model2_sft.py][INFO] Epoch:[0/2](9900/63764) loss:3.105 lr:0.0000185 epoch_Time:198.0min:
[2023-12-11 08:20:01,575][model2_sft.py][INFO] Epoch:[0/2](9950/63764) loss:3.255 lr:0.0000185 epoch_Time:198.0min:
[2023-12-11 08:20:12,603][model2_sft.py][INFO] Epoch:[0/2](10000/63764) loss:3.157 lr:0.0000185 epoch_Time:198.0min:
[2023-12-11 08:20:23,606][model2_sft.py][INFO] Epoch:[0/2](10050/63764) loss:3.094 lr:0.0000184 epoch_Time:198.0min:
[2023-12-11 08:20:34,579][model2_sft.py][INFO] Epoch:[0/2](10100/63764) loss:3.503 lr:0.0000184 epoch_Time:197.0min:
[2023-12-11 08:20:45,600][model2_sft.py][INFO] Epoch:[0/2](10150/63764) loss:3.674 lr:0.0000184 epoch_Time:197.0min:
[2023-12-11 08:20:56,675][model2_sft.py][INFO] Epoch:[0/2](10200/63764) loss:3.251 lr:0.0000184 epoch_Time:197.0min:
[2023-12-11 08:21:07,807][model2_sft.py][INFO] Epoch:[0/2](10250/63764) loss:3.557 lr:0.0000184 epoch_Time:197.0min:
[2023-12-11 08:21:19,006][model2_sft.py][INFO] Epoch:[0/2](10300/63764) loss:3.272 lr:0.0000184 epoch_Time:197.0min:
[2023-12-11 08:21:30,207][model2_sft.py][INFO] Epoch:[0/2](10350/63764) loss:3.730 lr:0.0000183 epoch_Time:196.0min:
[2023-12-11 08:21:41,252][model2_sft.py][INFO] Epoch:[0/2](10400/63764) loss:3.887 lr:0.0000183 epoch_Time:196.0min:
[2023-12-11 08:21:52,372][model2_sft.py][INFO] Epoch:[0/2](10450/63764) loss:3.904 lr:0.0000183 epoch_Time:196.0min:
[2023-12-11 08:22:03,490][model2_sft.py][INFO] Epoch:[0/2](10500/63764) loss:3.471 lr:0.0000183 epoch_Time:196.0min:
[2023-12-11 08:22:14,518][model2_sft.py][INFO] Epoch:[0/2](10550/63764) loss:3.716 lr:0.0000183 epoch_Time:196.0min:
[2023-12-11 08:22:25,579][model2_sft.py][INFO] Epoch:[0/2](10600/63764) loss:3.788 lr:0.0000183 epoch_Time:195.0min:
[2023-12-11 08:22:36,629][model2_sft.py][INFO] Epoch:[0/2](10650/63764) loss:3.389 lr:0.0000182 epoch_Time:195.0min:
[2023-12-11 08:22:47,712][model2_sft.py][INFO] Epoch:[0/2](10700/63764) loss:3.778 lr:0.0000182 epoch_Time:195.0min:
[2023-12-11 08:22:58,772][model2_sft.py][INFO] Epoch:[0/2](10750/63764) loss:3.211 lr:0.0000182 epoch_Time:195.0min:
[2023-12-11 08:23:09,867][model2_sft.py][INFO] Epoch:[0/2](10800/63764) loss:3.583 lr:0.0000182 epoch_Time:195.0min:
[2023-12-11 08:23:21,007][model2_sft.py][INFO] Epoch:[0/2](10850/63764) loss:3.384 lr:0.0000182 epoch_Time:195.0min:
[2023-12-11 08:23:32,093][model2_sft.py][INFO] Epoch:[0/2](10900/63764) loss:3.414 lr:0.0000181 epoch_Time:194.0min:
[2023-12-11 08:23:43,198][model2_sft.py][INFO] Epoch:[0/2](10950/63764) loss:4.021 lr:0.0000181 epoch_Time:194.0min:
[2023-12-11 08:23:54,251][model2_sft.py][INFO] Epoch:[0/2](11000/63764) loss:3.333 lr:0.0000181 epoch_Time:194.0min:
[2023-12-11 08:24:05,330][model2_sft.py][INFO] Epoch:[0/2](11050/63764) loss:3.943 lr:0.0000181 epoch_Time:194.0min:
[2023-12-11 08:24:16,374][model2_sft.py][INFO] Epoch:[0/2](11100/63764) loss:3.739 lr:0.0000181 epoch_Time:194.0min:
[2023-12-11 08:24:27,420][model2_sft.py][INFO] Epoch:[0/2](11150/63764) loss:2.766 lr:0.0000181 epoch_Time:193.0min:
[2023-12-11 08:24:38,471][model2_sft.py][INFO] Epoch:[0/2](11200/63764) loss:3.342 lr:0.0000180 epoch_Time:193.0min:
[2023-12-11 08:24:49,538][model2_sft.py][INFO] Epoch:[0/2](11250/63764) loss:4.700 lr:0.0000180 epoch_Time:193.0min:
[2023-12-11 08:25:00,591][model2_sft.py][INFO] Epoch:[0/2](11300/63764) loss:3.908 lr:0.0000180 epoch_Time:193.0min:
[2023-12-11 08:25:11,626][model2_sft.py][INFO] Epoch:[0/2](11350/63764) loss:3.906 lr:0.0000180 epoch_Time:193.0min:
[2023-12-11 08:25:22,656][model2_sft.py][INFO] Epoch:[0/2](11400/63764) loss:3.429 lr:0.0000180 epoch_Time:193.0min:
[2023-12-11 08:25:33,706][model2_sft.py][INFO] Epoch:[0/2](11450/63764) loss:3.455 lr:0.0000179 epoch_Time:192.0min:
[2023-12-11 08:25:44,752][model2_sft.py][INFO] Epoch:[0/2](11500/63764) loss:3.792 lr:0.0000179 epoch_Time:192.0min:
[2023-12-11 08:25:55,787][model2_sft.py][INFO] Epoch:[0/2](11550/63764) loss:3.650 lr:0.0000179 epoch_Time:192.0min:
[2023-12-11 08:26:06,894][model2_sft.py][INFO] Epoch:[0/2](11600/63764) loss:3.386 lr:0.0000179 epoch_Time:192.0min:
[2023-12-11 08:26:17,928][model2_sft.py][INFO] Epoch:[0/2](11650/63764) loss:3.960 lr:0.0000179 epoch_Time:192.0min:
[2023-12-11 08:26:28,953][model2_sft.py][INFO] Epoch:[0/2](11700/63764) loss:3.478 lr:0.0000179 epoch_Time:191.0min:
[2023-12-11 08:26:40,023][model2_sft.py][INFO] Epoch:[0/2](11750/63764) loss:3.786 lr:0.0000178 epoch_Time:191.0min:
[2023-12-11 08:26:51,064][model2_sft.py][INFO] Epoch:[0/2](11800/63764) loss:3.803 lr:0.0000178 epoch_Time:191.0min:
[2023-12-11 08:27:02,161][model2_sft.py][INFO] Epoch:[0/2](11850/63764) loss:3.482 lr:0.0000178 epoch_Time:191.0min:
[2023-12-11 08:27:13,173][model2_sft.py][INFO] Epoch:[0/2](11900/63764) loss:3.368 lr:0.0000178 epoch_Time:191.0min:
[2023-12-11 08:27:24,203][model2_sft.py][INFO] Epoch:[0/2](11950/63764) loss:2.985 lr:0.0000178 epoch_Time:191.0min:
[2023-12-11 08:27:35,243][model2_sft.py][INFO] Epoch:[0/2](12000/63764) loss:3.802 lr:0.0000177 epoch_Time:190.0min:
[2023-12-11 08:27:46,279][model2_sft.py][INFO] Epoch:[0/2](12050/63764) loss:4.333 lr:0.0000177 epoch_Time:190.0min:
[2023-12-11 08:27:57,344][model2_sft.py][INFO] Epoch:[0/2](12100/63764) loss:3.950 lr:0.0000177 epoch_Time:190.0min:
[2023-12-11 08:28:08,359][model2_sft.py][INFO] Epoch:[0/2](12150/63764) loss:3.495 lr:0.0000177 epoch_Time:190.0min:
[2023-12-11 08:28:19,395][model2_sft.py][INFO] Epoch:[0/2](12200/63764) loss:3.471 lr:0.0000177 epoch_Time:190.0min:
[2023-12-11 08:28:30,429][model2_sft.py][INFO] Epoch:[0/2](12250/63764) loss:3.483 lr:0.0000176 epoch_Time:189.0min:
[2023-12-11 08:28:41,460][model2_sft.py][INFO] Epoch:[0/2](12300/63764) loss:3.688 lr:0.0000176 epoch_Time:189.0min:
[2023-12-11 08:28:52,506][model2_sft.py][INFO] Epoch:[0/2](12350/63764) loss:3.941 lr:0.0000176 epoch_Time:189.0min:
[2023-12-11 08:29:03,566][model2_sft.py][INFO] Epoch:[0/2](12400/63764) loss:3.490 lr:0.0000176 epoch_Time:189.0min:
[2023-12-11 08:29:14,629][model2_sft.py][INFO] Epoch:[0/2](12450/63764) loss:3.654 lr:0.0000176 epoch_Time:189.0min:
[2023-12-11 08:29:25,769][model2_sft.py][INFO] Epoch:[0/2](12500/63764) loss:3.306 lr:0.0000175 epoch_Time:188.0min:
[2023-12-11 08:29:36,821][model2_sft.py][INFO] Epoch:[0/2](12550/63764) loss:4.260 lr:0.0000175 epoch_Time:188.0min:
[2023-12-11 08:29:47,857][model2_sft.py][INFO] Epoch:[0/2](12600/63764) loss:3.673 lr:0.0000175 epoch_Time:188.0min:
[2023-12-11 08:29:58,897][model2_sft.py][INFO] Epoch:[0/2](12650/63764) loss:3.266 lr:0.0000175 epoch_Time:188.0min:
[2023-12-11 08:30:09,907][model2_sft.py][INFO] Epoch:[0/2](12700/63764) loss:3.304 lr:0.0000175 epoch_Time:188.0min:
[2023-12-11 08:30:20,936][model2_sft.py][INFO] Epoch:[0/2](12750/63764) loss:4.056 lr:0.0000174 epoch_Time:188.0min:
[2023-12-11 08:30:32,014][model2_sft.py][INFO] Epoch:[0/2](12800/63764) loss:4.005 lr:0.0000174 epoch_Time:187.0min:
[2023-12-11 08:30:43,102][model2_sft.py][INFO] Epoch:[0/2](12850/63764) loss:3.558 lr:0.0000174 epoch_Time:187.0min:
[2023-12-11 08:30:54,177][model2_sft.py][INFO] Epoch:[0/2](12900/63764) loss:4.109 lr:0.0000174 epoch_Time:187.0min:
[2023-12-11 08:31:05,258][model2_sft.py][INFO] Epoch:[0/2](12950/63764) loss:3.507 lr:0.0000173 epoch_Time:187.0min:
[2023-12-11 08:31:16,345][model2_sft.py][INFO] Epoch:[0/2](13000/63764) loss:3.430 lr:0.0000173 epoch_Time:187.0min:
[2023-12-11 08:31:27,390][model2_sft.py][INFO] Epoch:[0/2](13050/63764) loss:3.920 lr:0.0000173 epoch_Time:186.0min:
[2023-12-11 08:31:38,414][model2_sft.py][INFO] Epoch:[0/2](13100/63764) loss:3.909 lr:0.0000173 epoch_Time:186.0min:
[2023-12-11 08:31:49,482][model2_sft.py][INFO] Epoch:[0/2](13150/63764) loss:3.715 lr:0.0000173 epoch_Time:186.0min:
[2023-12-11 08:32:00,528][model2_sft.py][INFO] Epoch:[0/2](13200/63764) loss:3.197 lr:0.0000172 epoch_Time:186.0min:
[2023-12-11 08:32:11,587][model2_sft.py][INFO] Epoch:[0/2](13250/63764) loss:3.459 lr:0.0000172 epoch_Time:186.0min:
[2023-12-11 08:32:22,663][model2_sft.py][INFO] Epoch:[0/2](13300/63764) loss:3.725 lr:0.0000172 epoch_Time:186.0min:
[2023-12-11 08:32:33,718][model2_sft.py][INFO] Epoch:[0/2](13350/63764) loss:4.300 lr:0.0000172 epoch_Time:185.0min:
[2023-12-11 08:32:44,759][model2_sft.py][INFO] Epoch:[0/2](13400/63764) loss:3.771 lr:0.0000172 epoch_Time:185.0min:
[2023-12-11 08:32:55,786][model2_sft.py][INFO] Epoch:[0/2](13450/63764) loss:3.614 lr:0.0000171 epoch_Time:185.0min:
[2023-12-11 08:33:06,813][model2_sft.py][INFO] Epoch:[0/2](13500/63764) loss:3.679 lr:0.0000171 epoch_Time:185.0min:
[2023-12-11 08:33:17,867][model2_sft.py][INFO] Epoch:[0/2](13550/63764) loss:3.749 lr:0.0000171 epoch_Time:185.0min:
[2023-12-11 08:33:28,893][model2_sft.py][INFO] Epoch:[0/2](13600/63764) loss:3.352 lr:0.0000171 epoch_Time:184.0min:
[2023-12-11 08:33:39,907][model2_sft.py][INFO] Epoch:[0/2](13650/63764) loss:3.647 lr:0.0000170 epoch_Time:184.0min:
[2023-12-11 08:33:50,913][model2_sft.py][INFO] Epoch:[0/2](13700/63764) loss:3.560 lr:0.0000170 epoch_Time:184.0min:
[2023-12-11 08:34:01,926][model2_sft.py][INFO] Epoch:[0/2](13750/63764) loss:3.347 lr:0.0000170 epoch_Time:184.0min:
[2023-12-11 08:34:13,001][model2_sft.py][INFO] Epoch:[0/2](13800/63764) loss:3.670 lr:0.0000170 epoch_Time:184.0min:
[2023-12-11 08:34:24,049][model2_sft.py][INFO] Epoch:[0/2](13850/63764) loss:2.903 lr:0.0000170 epoch_Time:184.0min:
[2023-12-11 08:34:35,081][model2_sft.py][INFO] Epoch:[0/2](13900/63764) loss:4.385 lr:0.0000169 epoch_Time:183.0min:
[2023-12-11 08:34:46,110][model2_sft.py][INFO] Epoch:[0/2](13950/63764) loss:3.931 lr:0.0000169 epoch_Time:183.0min:
[2023-12-11 08:34:57,147][model2_sft.py][INFO] Epoch:[0/2](14000/63764) loss:3.781 lr:0.0000169 epoch_Time:183.0min:
[2023-12-11 08:35:08,157][model2_sft.py][INFO] Epoch:[0/2](14050/63764) loss:3.741 lr:0.0000169 epoch_Time:183.0min:
[2023-12-11 08:35:19,210][model2_sft.py][INFO] Epoch:[0/2](14100/63764) loss:4.365 lr:0.0000168 epoch_Time:183.0min:
[2023-12-11 08:35:30,254][model2_sft.py][INFO] Epoch:[0/2](14150/63764) loss:4.205 lr:0.0000168 epoch_Time:182.0min:
[2023-12-11 08:35:41,317][model2_sft.py][INFO] Epoch:[0/2](14200/63764) loss:4.110 lr:0.0000168 epoch_Time:182.0min:
[2023-12-11 08:35:52,303][model2_sft.py][INFO] Epoch:[0/2](14250/63764) loss:3.478 lr:0.0000168 epoch_Time:182.0min:
[2023-12-11 08:36:03,320][model2_sft.py][INFO] Epoch:[0/2](14300/63764) loss:3.387 lr:0.0000168 epoch_Time:182.0min:
[2023-12-11 08:36:14,468][model2_sft.py][INFO] Epoch:[0/2](14350/63764) loss:4.008 lr:0.0000167 epoch_Time:182.0min:
[2023-12-11 08:36:25,544][model2_sft.py][INFO] Epoch:[0/2](14400/63764) loss:3.178 lr:0.0000167 epoch_Time:181.0min:
[2023-12-11 08:36:36,617][model2_sft.py][INFO] Epoch:[0/2](14450/63764) loss:3.475 lr:0.0000167 epoch_Time:181.0min:
[2023-12-11 08:36:47,656][model2_sft.py][INFO] Epoch:[0/2](14500/63764) loss:3.910 lr:0.0000167 epoch_Time:181.0min:
[2023-12-11 08:36:58,697][model2_sft.py][INFO] Epoch:[0/2](14550/63764) loss:3.868 lr:0.0000166 epoch_Time:181.0min:
[2023-12-11 08:37:09,743][model2_sft.py][INFO] Epoch:[0/2](14600/63764) loss:3.349 lr:0.0000166 epoch_Time:181.0min:
[2023-12-11 08:37:20,806][model2_sft.py][INFO] Epoch:[0/2](14650/63764) loss:3.609 lr:0.0000166 epoch_Time:181.0min:
[2023-12-11 08:37:31,848][model2_sft.py][INFO] Epoch:[0/2](14700/63764) loss:3.616 lr:0.0000166 epoch_Time:180.0min:
[2023-12-11 08:37:42,852][model2_sft.py][INFO] Epoch:[0/2](14750/63764) loss:3.397 lr:0.0000165 epoch_Time:180.0min:
[2023-12-11 08:37:53,869][model2_sft.py][INFO] Epoch:[0/2](14800/63764) loss:3.635 lr:0.0000165 epoch_Time:180.0min:
[2023-12-11 08:38:04,941][model2_sft.py][INFO] Epoch:[0/2](14850/63764) loss:3.490 lr:0.0000165 epoch_Time:180.0min:
[2023-12-11 08:38:15,948][model2_sft.py][INFO] Epoch:[0/2](14900/63764) loss:2.769 lr:0.0000165 epoch_Time:180.0min:
[2023-12-11 08:38:26,990][model2_sft.py][INFO] Epoch:[0/2](14950/63764) loss:3.870 lr:0.0000164 epoch_Time:179.0min:
[2023-12-11 08:38:38,066][model2_sft.py][INFO] Epoch:[0/2](15000/63764) loss:3.605 lr:0.0000164 epoch_Time:179.0min:
[2023-12-11 08:38:49,111][model2_sft.py][INFO] Epoch:[0/2](15050/63764) loss:4.008 lr:0.0000164 epoch_Time:179.0min:
[2023-12-11 08:39:00,165][model2_sft.py][INFO] Epoch:[0/2](15100/63764) loss:3.983 lr:0.0000164 epoch_Time:179.0min:
[2023-12-11 08:39:11,175][model2_sft.py][INFO] Epoch:[0/2](15150/63764) loss:3.314 lr:0.0000164 epoch_Time:179.0min:
[2023-12-11 08:39:22,209][model2_sft.py][INFO] Epoch:[0/2](15200/63764) loss:3.968 lr:0.0000163 epoch_Time:179.0min:
[2023-12-11 08:39:33,315][model2_sft.py][INFO] Epoch:[0/2](15250/63764) loss:3.490 lr:0.0000163 epoch_Time:178.0min:
[2023-12-11 08:39:44,361][model2_sft.py][INFO] Epoch:[0/2](15300/63764) loss:3.620 lr:0.0000163 epoch_Time:178.0min:
[2023-12-11 08:39:55,489][model2_sft.py][INFO] Epoch:[0/2](15350/63764) loss:3.355 lr:0.0000163 epoch_Time:178.0min:
[2023-12-11 08:40:06,565][model2_sft.py][INFO] Epoch:[0/2](15400/63764) loss:3.741 lr:0.0000162 epoch_Time:178.0min:
[2023-12-11 08:40:17,576][model2_sft.py][INFO] Epoch:[0/2](15450/63764) loss:3.967 lr:0.0000162 epoch_Time:178.0min:
[2023-12-11 08:40:28,664][model2_sft.py][INFO] Epoch:[0/2](15500/63764) loss:3.221 lr:0.0000162 epoch_Time:177.0min:
[2023-12-11 08:40:39,731][model2_sft.py][INFO] Epoch:[0/2](15550/63764) loss:3.070 lr:0.0000162 epoch_Time:177.0min:
[2023-12-11 08:40:50,774][model2_sft.py][INFO] Epoch:[0/2](15600/63764) loss:2.944 lr:0.0000161 epoch_Time:177.0min:
[2023-12-11 08:41:01,790][model2_sft.py][INFO] Epoch:[0/2](15650/63764) loss:2.847 lr:0.0000161 epoch_Time:177.0min:
[2023-12-11 08:41:12,845][model2_sft.py][INFO] Epoch:[0/2](15700/63764) loss:4.172 lr:0.0000161 epoch_Time:177.0min:
[2023-12-11 08:41:23,914][model2_sft.py][INFO] Epoch:[0/2](15750/63764) loss:3.714 lr:0.0000161 epoch_Time:177.0min:
[2023-12-11 08:41:35,020][model2_sft.py][INFO] Epoch:[0/2](15800/63764) loss:4.375 lr:0.0000160 epoch_Time:176.0min:
[2023-12-11 08:41:46,051][model2_sft.py][INFO] Epoch:[0/2](15850/63764) loss:3.359 lr:0.0000160 epoch_Time:176.0min:
[2023-12-11 08:41:57,113][model2_sft.py][INFO] Epoch:[0/2](15900/63764) loss:3.463 lr:0.0000160 epoch_Time:176.0min:
[2023-12-11 08:42:08,207][model2_sft.py][INFO] Epoch:[0/2](15950/63764) loss:3.506 lr:0.0000160 epoch_Time:176.0min:
[2023-12-11 08:42:19,265][model2_sft.py][INFO] Epoch:[0/2](16000/63764) loss:3.741 lr:0.0000159 epoch_Time:176.0min:
[2023-12-11 08:42:30,390][model2_sft.py][INFO] Epoch:[0/2](16050/63764) loss:3.768 lr:0.0000159 epoch_Time:175.0min:
[2023-12-11 08:42:41,501][model2_sft.py][INFO] Epoch:[0/2](16100/63764) loss:3.664 lr:0.0000159 epoch_Time:175.0min:
[2023-12-11 08:42:52,629][model2_sft.py][INFO] Epoch:[0/2](16150/63764) loss:3.802 lr:0.0000159 epoch_Time:175.0min:
[2023-12-11 
08:43:03,778][model2_sft.py][INFO] Epoch:[0/2](16200/63764) loss:3.704 lr:0.0000158 epoch_Time:175.0min: [2023-12-11 08:43:14,854][model2_sft.py][INFO] Epoch:[0/2](16250/63764) loss:3.599 lr:0.0000158 epoch_Time:175.0min: [2023-12-11 08:43:25,927][model2_sft.py][INFO] Epoch:[0/2](16300/63764) loss:3.413 lr:0.0000158 epoch_Time:174.0min: [2023-12-11 08:43:37,069][model2_sft.py][INFO] Epoch:[0/2](16350/63764) loss:2.868 lr:0.0000158 epoch_Time:174.0min: [2023-12-11 08:43:48,145][model2_sft.py][INFO] Epoch:[0/2](16400/63764) loss:3.197 lr:0.0000157 epoch_Time:174.0min: [2023-12-11 08:43:59,189][model2_sft.py][INFO] Epoch:[0/2](16450/63764) loss:3.036 lr:0.0000157 epoch_Time:174.0min: [2023-12-11 08:44:10,248][model2_sft.py][INFO] Epoch:[0/2](16500/63764) loss:3.944 lr:0.0000157 epoch_Time:174.0min: [2023-12-11 08:44:21,297][model2_sft.py][INFO] Epoch:[0/2](16550/63764) loss:3.466 lr:0.0000157 epoch_Time:174.0min: [2023-12-11 08:44:32,350][model2_sft.py][INFO] Epoch:[0/2](16600/63764) loss:3.720 lr:0.0000156 epoch_Time:173.0min: [2023-12-11 08:44:43,356][model2_sft.py][INFO] Epoch:[0/2](16650/63764) loss:3.987 lr:0.0000156 epoch_Time:173.0min: [2023-12-11 08:44:54,410][model2_sft.py][INFO] Epoch:[0/2](16700/63764) loss:3.754 lr:0.0000156 epoch_Time:173.0min: [2023-12-11 08:45:05,479][model2_sft.py][INFO] Epoch:[0/2](16750/63764) loss:3.652 lr:0.0000156 epoch_Time:173.0min: [2023-12-11 08:45:16,497][model2_sft.py][INFO] Epoch:[0/2](16800/63764) loss:4.339 lr:0.0000155 epoch_Time:173.0min: [2023-12-11 08:45:27,561][model2_sft.py][INFO] Epoch:[0/2](16850/63764) loss:3.523 lr:0.0000155 epoch_Time:172.0min: [2023-12-11 08:45:38,708][model2_sft.py][INFO] Epoch:[0/2](16900/63764) loss:3.678 lr:0.0000155 epoch_Time:172.0min: [2023-12-11 08:45:49,875][model2_sft.py][INFO] Epoch:[0/2](16950/63764) loss:3.409 lr:0.0000155 epoch_Time:172.0min: [2023-12-11 08:46:00,997][model2_sft.py][INFO] Epoch:[0/2](17000/63764) loss:3.915 lr:0.0000154 epoch_Time:172.0min: [2023-12-11 
08:46:12,156][model2_sft.py][INFO] Epoch:[0/2](17050/63764) loss:3.739 lr:0.0000154 epoch_Time:172.0min: [2023-12-11 08:46:23,344][model2_sft.py][INFO] Epoch:[0/2](17100/63764) loss:3.750 lr:0.0000154 epoch_Time:172.0min: [2023-12-11 08:46:34,421][model2_sft.py][INFO] Epoch:[0/2](17150/63764) loss:3.919 lr:0.0000153 epoch_Time:171.0min: [2023-12-11 08:46:45,540][model2_sft.py][INFO] Epoch:[0/2](17200/63764) loss:3.322 lr:0.0000153 epoch_Time:171.0min: [2023-12-11 08:46:56,595][model2_sft.py][INFO] Epoch:[0/2](17250/63764) loss:3.497 lr:0.0000153 epoch_Time:171.0min: [2023-12-11 08:47:07,701][model2_sft.py][INFO] Epoch:[0/2](17300/63764) loss:3.179 lr:0.0000153 epoch_Time:171.0min: [2023-12-11 08:47:18,927][model2_sft.py][INFO] Epoch:[0/2](17350/63764) loss:3.616 lr:0.0000152 epoch_Time:171.0min: [2023-12-11 08:47:30,022][model2_sft.py][INFO] Epoch:[0/2](17400/63764) loss:3.389 lr:0.0000152 epoch_Time:170.0min: [2023-12-11 08:47:41,130][model2_sft.py][INFO] Epoch:[0/2](17450/63764) loss:3.626 lr:0.0000152 epoch_Time:170.0min: [2023-12-11 08:47:52,219][model2_sft.py][INFO] Epoch:[0/2](17500/63764) loss:3.430 lr:0.0000152 epoch_Time:170.0min: [2023-12-11 08:48:03,295][model2_sft.py][INFO] Epoch:[0/2](17550/63764) loss:3.053 lr:0.0000151 epoch_Time:170.0min: [2023-12-11 08:48:14,390][model2_sft.py][INFO] Epoch:[0/2](17600/63764) loss:3.185 lr:0.0000151 epoch_Time:170.0min: [2023-12-11 08:48:25,463][model2_sft.py][INFO] Epoch:[0/2](17650/63764) loss:3.517 lr:0.0000151 epoch_Time:169.0min: [2023-12-11 08:48:36,545][model2_sft.py][INFO] Epoch:[0/2](17700/63764) loss:3.604 lr:0.0000151 epoch_Time:169.0min: [2023-12-11 08:48:47,607][model2_sft.py][INFO] Epoch:[0/2](17750/63764) loss:3.970 lr:0.0000150 epoch_Time:169.0min: [2023-12-11 08:48:58,671][model2_sft.py][INFO] Epoch:[0/2](17800/63764) loss:3.152 lr:0.0000150 epoch_Time:169.0min: [2023-12-11 08:49:09,702][model2_sft.py][INFO] Epoch:[0/2](17850/63764) loss:4.022 lr:0.0000150 epoch_Time:169.0min: [2023-12-11 
08:49:20,762][model2_sft.py][INFO] Epoch:[0/2](17900/63764) loss:4.046 lr:0.0000149 epoch_Time:169.0min: [2023-12-11 08:49:31,874][model2_sft.py][INFO] Epoch:[0/2](17950/63764) loss:3.721 lr:0.0000149 epoch_Time:168.0min: [2023-12-11 08:49:42,935][model2_sft.py][INFO] Epoch:[0/2](18000/63764) loss:4.055 lr:0.0000149 epoch_Time:168.0min: [2023-12-11 08:49:53,978][model2_sft.py][INFO] Epoch:[0/2](18050/63764) loss:3.830 lr:0.0000149 epoch_Time:168.0min: [2023-12-11 08:50:05,026][model2_sft.py][INFO] Epoch:[0/2](18100/63764) loss:3.594 lr:0.0000148 epoch_Time:168.0min: [2023-12-11 08:50:16,047][model2_sft.py][INFO] Epoch:[0/2](18150/63764) loss:3.956 lr:0.0000148 epoch_Time:168.0min: [2023-12-11 08:50:27,098][model2_sft.py][INFO] Epoch:[0/2](18200/63764) loss:3.214 lr:0.0000148 epoch_Time:167.0min: [2023-12-11 08:50:38,134][model2_sft.py][INFO] Epoch:[0/2](18250/63764) loss:3.371 lr:0.0000148 epoch_Time:167.0min: [2023-12-11 08:50:49,194][model2_sft.py][INFO] Epoch:[0/2](18300/63764) loss:3.440 lr:0.0000147 epoch_Time:167.0min: [2023-12-11 08:51:00,261][model2_sft.py][INFO] Epoch:[0/2](18350/63764) loss:3.898 lr:0.0000147 epoch_Time:167.0min: [2023-12-11 08:51:11,371][model2_sft.py][INFO] Epoch:[0/2](18400/63764) loss:3.771 lr:0.0000147 epoch_Time:167.0min: [2023-12-11 08:51:22,432][model2_sft.py][INFO] Epoch:[0/2](18450/63764) loss:4.097 lr:0.0000146 epoch_Time:167.0min: [2023-12-11 08:51:33,444][model2_sft.py][INFO] Epoch:[0/2](18500/63764) loss:3.579 lr:0.0000146 epoch_Time:166.0min: [2023-12-11 08:51:44,465][model2_sft.py][INFO] Epoch:[0/2](18550/63764) loss:3.317 lr:0.0000146 epoch_Time:166.0min: [2023-12-11 08:51:55,476][model2_sft.py][INFO] Epoch:[0/2](18600/63764) loss:3.393 lr:0.0000146 epoch_Time:166.0min: [2023-12-11 08:52:06,559][model2_sft.py][INFO] Epoch:[0/2](18650/63764) loss:4.069 lr:0.0000145 epoch_Time:166.0min: [2023-12-11 08:52:17,571][model2_sft.py][INFO] Epoch:[0/2](18700/63764) loss:3.549 lr:0.0000145 epoch_Time:166.0min: [2023-12-11 
08:52:28,598][model2_sft.py][INFO] Epoch:[0/2](18750/63764) loss:3.080 lr:0.0000145 epoch_Time:165.0min: [2023-12-11 08:52:39,626][model2_sft.py][INFO] Epoch:[0/2](18800/63764) loss:3.458 lr:0.0000145 epoch_Time:165.0min: [2023-12-11 08:52:50,694][model2_sft.py][INFO] Epoch:[0/2](18850/63764) loss:3.575 lr:0.0000144 epoch_Time:165.0min: [2023-12-11 08:53:01,742][model2_sft.py][INFO] Epoch:[0/2](18900/63764) loss:3.329 lr:0.0000144 epoch_Time:165.0min: [2023-12-11 08:53:12,785][model2_sft.py][INFO] Epoch:[0/2](18950/63764) loss:3.156 lr:0.0000144 epoch_Time:165.0min: [2023-12-11 08:53:23,824][model2_sft.py][INFO] Epoch:[0/2](19000/63764) loss:3.065 lr:0.0000143 epoch_Time:165.0min: [2023-12-11 08:53:34,848][model2_sft.py][INFO] Epoch:[0/2](19050/63764) loss:3.333 lr:0.0000143 epoch_Time:164.0min: [2023-12-11 08:53:45,944][model2_sft.py][INFO] Epoch:[0/2](19100/63764) loss:3.872 lr:0.0000143 epoch_Time:164.0min: [2023-12-11 08:53:56,983][model2_sft.py][INFO] Epoch:[0/2](19150/63764) loss:3.333 lr:0.0000143 epoch_Time:164.0min: [2023-12-11 08:54:08,046][model2_sft.py][INFO] Epoch:[0/2](19200/63764) loss:3.710 lr:0.0000142 epoch_Time:164.0min: [2023-12-11 08:54:19,071][model2_sft.py][INFO] Epoch:[0/2](19250/63764) loss:3.586 lr:0.0000142 epoch_Time:164.0min: [2023-12-11 08:54:30,084][model2_sft.py][INFO] Epoch:[0/2](19300/63764) loss:3.810 lr:0.0000142 epoch_Time:163.0min: [2023-12-11 08:54:41,144][model2_sft.py][INFO] Epoch:[0/2](19350/63764) loss:3.473 lr:0.0000141 epoch_Time:163.0min: [2023-12-11 08:54:52,235][model2_sft.py][INFO] Epoch:[0/2](19400/63764) loss:3.879 lr:0.0000141 epoch_Time:163.0min: [2023-12-11 08:55:03,231][model2_sft.py][INFO] Epoch:[0/2](19450/63764) loss:3.741 lr:0.0000141 epoch_Time:163.0min: [2023-12-11 08:55:14,285][model2_sft.py][INFO] Epoch:[0/2](19500/63764) loss:3.502 lr:0.0000141 epoch_Time:163.0min: [2023-12-11 08:55:25,328][model2_sft.py][INFO] Epoch:[0/2](19550/63764) loss:4.091 lr:0.0000140 epoch_Time:162.0min: [2023-12-11 
08:55:36,363][model2_sft.py][INFO] Epoch:[0/2](19600/63764) loss:3.836 lr:0.0000140 epoch_Time:162.0min: [2023-12-11 08:55:47,397][model2_sft.py][INFO] Epoch:[0/2](19650/63764) loss:3.493 lr:0.0000140 epoch_Time:162.0min: [2023-12-11 08:55:58,452][model2_sft.py][INFO] Epoch:[0/2](19700/63764) loss:3.400 lr:0.0000140 epoch_Time:162.0min: [2023-12-11 08:56:09,510][model2_sft.py][INFO] Epoch:[0/2](19750/63764) loss:3.398 lr:0.0000139 epoch_Time:162.0min: [2023-12-11 08:56:20,551][model2_sft.py][INFO] Epoch:[0/2](19800/63764) loss:4.048 lr:0.0000139 epoch_Time:162.0min: [2023-12-11 08:56:31,613][model2_sft.py][INFO] Epoch:[0/2](19850/63764) loss:3.874 lr:0.0000139 epoch_Time:161.0min: [2023-12-11 08:56:42,658][model2_sft.py][INFO] Epoch:[0/2](19900/63764) loss:3.882 lr:0.0000138 epoch_Time:161.0min: [2023-12-11 08:56:53,703][model2_sft.py][INFO] Epoch:[0/2](19950/63764) loss:3.378 lr:0.0000138 epoch_Time:161.0min: [2023-12-11 08:57:04,763][model2_sft.py][INFO] Epoch:[0/2](20000/63764) loss:3.519 lr:0.0000138 epoch_Time:161.0min: [2023-12-11 08:57:15,867][model2_sft.py][INFO] Epoch:[0/2](20050/63764) loss:3.899 lr:0.0000138 epoch_Time:161.0min: [2023-12-11 08:57:26,954][model2_sft.py][INFO] Epoch:[0/2](20100/63764) loss:3.311 lr:0.0000137 epoch_Time:160.0min: [2023-12-11 08:57:37,978][model2_sft.py][INFO] Epoch:[0/2](20150/63764) loss:3.847 lr:0.0000137 epoch_Time:160.0min: [2023-12-11 08:57:48,983][model2_sft.py][INFO] Epoch:[0/2](20200/63764) loss:3.818 lr:0.0000137 epoch_Time:160.0min: [2023-12-11 08:58:00,018][model2_sft.py][INFO] Epoch:[0/2](20250/63764) loss:3.965 lr:0.0000136 epoch_Time:160.0min: [2023-12-11 08:58:11,067][model2_sft.py][INFO] Epoch:[0/2](20300/63764) loss:3.904 lr:0.0000136 epoch_Time:160.0min: [2023-12-11 08:58:22,100][model2_sft.py][INFO] Epoch:[0/2](20350/63764) loss:3.917 lr:0.0000136 epoch_Time:160.0min: [2023-12-11 08:58:33,152][model2_sft.py][INFO] Epoch:[0/2](20400/63764) loss:4.037 lr:0.0000136 epoch_Time:159.0min: [2023-12-11 
08:58:44,194][model2_sft.py][INFO] Epoch:[0/2](20450/63764) loss:4.493 lr:0.0000135 epoch_Time:159.0min: [2023-12-11 08:58:55,200][model2_sft.py][INFO] Epoch:[0/2](20500/63764) loss:4.108 lr:0.0000135 epoch_Time:159.0min: [2023-12-11 08:59:06,206][model2_sft.py][INFO] Epoch:[0/2](20550/63764) loss:3.491 lr:0.0000135 epoch_Time:159.0min: [2023-12-11 08:59:17,269][model2_sft.py][INFO] Epoch:[0/2](20600/63764) loss:3.298 lr:0.0000134 epoch_Time:159.0min: [2023-12-11 08:59:28,325][model2_sft.py][INFO] Epoch:[0/2](20650/63764) loss:3.769 lr:0.0000134 epoch_Time:158.0min: [2023-12-11 08:59:39,356][model2_sft.py][INFO] Epoch:[0/2](20700/63764) loss:3.595 lr:0.0000134 epoch_Time:158.0min: [2023-12-11 08:59:50,387][model2_sft.py][INFO] Epoch:[0/2](20750/63764) loss:3.770 lr:0.0000133 epoch_Time:158.0min: [2023-12-11 09:00:01,415][model2_sft.py][INFO] Epoch:[0/2](20800/63764) loss:3.593 lr:0.0000133 epoch_Time:158.0min: [2023-12-11 09:00:12,512][model2_sft.py][INFO] Epoch:[0/2](20850/63764) loss:3.897 lr:0.0000133 epoch_Time:158.0min: [2023-12-11 09:00:23,581][model2_sft.py][INFO] Epoch:[0/2](20900/63764) loss:3.967 lr:0.0000133 epoch_Time:158.0min: [2023-12-11 09:00:34,649][model2_sft.py][INFO] Epoch:[0/2](20950/63764) loss:3.584 lr:0.0000132 epoch_Time:157.0min: [2023-12-11 09:00:45,723][model2_sft.py][INFO] Epoch:[0/2](21000/63764) loss:3.591 lr:0.0000132 epoch_Time:157.0min: [2023-12-11 09:00:56,764][model2_sft.py][INFO] Epoch:[0/2](21050/63764) loss:3.256 lr:0.0000132 epoch_Time:157.0min: [2023-12-11 09:01:07,813][model2_sft.py][INFO] Epoch:[0/2](21100/63764) loss:3.300 lr:0.0000131 epoch_Time:157.0min: [2023-12-11 09:01:18,848][model2_sft.py][INFO] Epoch:[0/2](21150/63764) loss:3.569 lr:0.0000131 epoch_Time:157.0min: [2023-12-11 09:01:29,885][model2_sft.py][INFO] Epoch:[0/2](21200/63764) loss:3.177 lr:0.0000131 epoch_Time:156.0min: [2023-12-11 09:01:40,925][model2_sft.py][INFO] Epoch:[0/2](21250/63764) loss:3.501 lr:0.0000131 epoch_Time:156.0min: [2023-12-11 
09:01:51,953][model2_sft.py][INFO] Epoch:[0/2](21300/63764) loss:4.233 lr:0.0000130 epoch_Time:156.0min: [2023-12-11 09:02:03,007][model2_sft.py][INFO] Epoch:[0/2](21350/63764) loss:3.502 lr:0.0000130 epoch_Time:156.0min: [2023-12-11 09:02:14,068][model2_sft.py][INFO] Epoch:[0/2](21400/63764) loss:3.248 lr:0.0000130 epoch_Time:156.0min: [2023-12-11 09:02:25,122][model2_sft.py][INFO] Epoch:[0/2](21450/63764) loss:3.614 lr:0.0000129 epoch_Time:156.0min: [2023-12-11 09:02:36,178][model2_sft.py][INFO] Epoch:[0/2](21500/63764) loss:3.419 lr:0.0000129 epoch_Time:155.0min: [2023-12-11 09:02:47,267][model2_sft.py][INFO] Epoch:[0/2](21550/63764) loss:3.822 lr:0.0000129 epoch_Time:155.0min: [2023-12-11 09:02:58,346][model2_sft.py][INFO] Epoch:[0/2](21600/63764) loss:3.595 lr:0.0000129 epoch_Time:155.0min: [2023-12-11 09:03:09,434][model2_sft.py][INFO] Epoch:[0/2](21650/63764) loss:3.679 lr:0.0000128 epoch_Time:155.0min: [2023-12-11 09:03:20,523][model2_sft.py][INFO] Epoch:[0/2](21700/63764) loss:3.963 lr:0.0000128 epoch_Time:155.0min: [2023-12-11 09:03:31,626][model2_sft.py][INFO] Epoch:[0/2](21750/63764) loss:3.081 lr:0.0000128 epoch_Time:154.0min: [2023-12-11 09:03:42,724][model2_sft.py][INFO] Epoch:[0/2](21800/63764) loss:4.121 lr:0.0000127 epoch_Time:154.0min: [2023-12-11 09:03:53,770][model2_sft.py][INFO] Epoch:[0/2](21850/63764) loss:3.384 lr:0.0000127 epoch_Time:154.0min: [2023-12-11 09:04:04,801][model2_sft.py][INFO] Epoch:[0/2](21900/63764) loss:4.258 lr:0.0000127 epoch_Time:154.0min: [2023-12-11 09:04:15,839][model2_sft.py][INFO] Epoch:[0/2](21950/63764) loss:3.621 lr:0.0000126 epoch_Time:154.0min: [2023-12-11 09:04:26,876][model2_sft.py][INFO] Epoch:[0/2](22000/63764) loss:3.395 lr:0.0000126 epoch_Time:153.0min: [2023-12-11 09:04:37,939][model2_sft.py][INFO] Epoch:[0/2](22050/63764) loss:3.482 lr:0.0000126 epoch_Time:153.0min: [2023-12-11 09:04:49,007][model2_sft.py][INFO] Epoch:[0/2](22100/63764) loss:3.670 lr:0.0000126 epoch_Time:153.0min: [2023-12-11 
09:05:00,043][model2_sft.py][INFO] Epoch:[0/2](22150/63764) loss:2.806 lr:0.0000125 epoch_Time:153.0min: [2023-12-11 09:05:11,138][model2_sft.py][INFO] Epoch:[0/2](22200/63764) loss:3.504 lr:0.0000125 epoch_Time:153.0min: [2023-12-11 09:05:22,188][model2_sft.py][INFO] Epoch:[0/2](22250/63764) loss:3.443 lr:0.0000125 epoch_Time:153.0min: [2023-12-11 09:05:33,249][model2_sft.py][INFO] Epoch:[0/2](22300/63764) loss:3.403 lr:0.0000124 epoch_Time:152.0min: [2023-12-11 09:05:44,318][model2_sft.py][INFO] Epoch:[0/2](22350/63764) loss:3.956 lr:0.0000124 epoch_Time:152.0min: [2023-12-11 09:05:55,393][model2_sft.py][INFO] Epoch:[0/2](22400/63764) loss:3.955 lr:0.0000124 epoch_Time:152.0min: [2023-12-11 09:06:06,471][model2_sft.py][INFO] Epoch:[0/2](22450/63764) loss:3.588 lr:0.0000123 epoch_Time:152.0min: [2023-12-11 09:06:17,546][model2_sft.py][INFO] Epoch:[0/2](22500/63764) loss:3.557 lr:0.0000123 epoch_Time:152.0min: [2023-12-11 09:06:28,616][model2_sft.py][INFO] Epoch:[0/2](22550/63764) loss:3.289 lr:0.0000123 epoch_Time:151.0min: [2023-12-11 09:06:39,688][model2_sft.py][INFO] Epoch:[0/2](22600/63764) loss:3.535 lr:0.0000123 epoch_Time:151.0min: [2023-12-11 09:06:50,761][model2_sft.py][INFO] Epoch:[0/2](22650/63764) loss:3.524 lr:0.0000122 epoch_Time:151.0min: [2023-12-11 09:07:01,834][model2_sft.py][INFO] Epoch:[0/2](22700/63764) loss:3.573 lr:0.0000122 epoch_Time:151.0min: [2023-12-11 09:07:12,943][model2_sft.py][INFO] Epoch:[0/2](22750/63764) loss:2.980 lr:0.0000122 epoch_Time:151.0min: [2023-12-11 09:07:24,014][model2_sft.py][INFO] Epoch:[0/2](22800/63764) loss:3.610 lr:0.0000121 epoch_Time:151.0min: [2023-12-11 09:07:35,084][model2_sft.py][INFO] Epoch:[0/2](22850/63764) loss:3.462 lr:0.0000121 epoch_Time:150.0min: [2023-12-11 09:07:46,173][model2_sft.py][INFO] Epoch:[0/2](22900/63764) loss:3.404 lr:0.0000121 epoch_Time:150.0min: [2023-12-11 09:07:57,269][model2_sft.py][INFO] Epoch:[0/2](22950/63764) loss:2.396 lr:0.0000120 epoch_Time:150.0min: [2023-12-11 
09:08:08,337][model2_sft.py][INFO] Epoch:[0/2](23000/63764) loss:4.186 lr:0.0000120 epoch_Time:150.0min: [2023-12-11 09:08:19,493][model2_sft.py][INFO] Epoch:[0/2](23050/63764) loss:3.851 lr:0.0000120 epoch_Time:150.0min: [2023-12-11 09:08:30,589][model2_sft.py][INFO] Epoch:[0/2](23100/63764) loss:3.435 lr:0.0000120 epoch_Time:149.0min: [2023-12-11 09:08:41,665][model2_sft.py][INFO] Epoch:[0/2](23150/63764) loss:4.053 lr:0.0000119 epoch_Time:149.0min: [2023-12-11 09:08:52,742][model2_sft.py][INFO] Epoch:[0/2](23200/63764) loss:3.549 lr:0.0000119 epoch_Time:149.0min: [2023-12-11 09:09:03,828][model2_sft.py][INFO] Epoch:[0/2](23250/63764) loss:3.121 lr:0.0000119 epoch_Time:149.0min: [2023-12-11 09:09:14,929][model2_sft.py][INFO] Epoch:[0/2](23300/63764) loss:2.983 lr:0.0000118 epoch_Time:149.0min: [2023-12-11 09:09:26,026][model2_sft.py][INFO] Epoch:[0/2](23350/63764) loss:3.403 lr:0.0000118 epoch_Time:148.0min: [2023-12-11 09:09:37,095][model2_sft.py][INFO] Epoch:[0/2](23400/63764) loss:3.752 lr:0.0000118 epoch_Time:148.0min: [2023-12-11 09:09:48,135][model2_sft.py][INFO] Epoch:[0/2](23450/63764) loss:3.603 lr:0.0000117 epoch_Time:148.0min: [2023-12-11 09:09:59,263][model2_sft.py][INFO] Epoch:[0/2](23500/63764) loss:3.398 lr:0.0000117 epoch_Time:148.0min: [2023-12-11 09:10:10,314][model2_sft.py][INFO] Epoch:[0/2](23550/63764) loss:3.065 lr:0.0000117 epoch_Time:148.0min: [2023-12-11 09:10:21,343][model2_sft.py][INFO] Epoch:[0/2](23600/63764) loss:3.191 lr:0.0000117 epoch_Time:148.0min: [2023-12-11 09:10:32,425][model2_sft.py][INFO] Epoch:[0/2](23650/63764) loss:3.010 lr:0.0000116 epoch_Time:147.0min: [2023-12-11 09:10:43,504][model2_sft.py][INFO] Epoch:[0/2](23700/63764) loss:3.471 lr:0.0000116 epoch_Time:147.0min: [2023-12-11 09:10:54,590][model2_sft.py][INFO] Epoch:[0/2](23750/63764) loss:3.813 lr:0.0000116 epoch_Time:147.0min: [2023-12-11 09:11:05,689][model2_sft.py][INFO] Epoch:[0/2](23800/63764) loss:3.266 lr:0.0000115 epoch_Time:147.0min: [2023-12-11 
09:11:16,810][model2_sft.py][INFO] Epoch:[0/2](23850/63764) loss:3.845 lr:0.0000115 epoch_Time:147.0min: [2023-12-11 09:11:27,882][model2_sft.py][INFO] Epoch:[0/2](23900/63764) loss:3.315 lr:0.0000115 epoch_Time:146.0min: [2023-12-11 09:11:38,974][model2_sft.py][INFO] Epoch:[0/2](23950/63764) loss:3.967 lr:0.0000114 epoch_Time:146.0min: [2023-12-11 09:11:50,074][model2_sft.py][INFO] Epoch:[0/2](24000/63764) loss:3.782 lr:0.0000114 epoch_Time:146.0min: [2023-12-11 09:12:01,167][model2_sft.py][INFO] Epoch:[0/2](24050/63764) loss:3.470 lr:0.0000114 epoch_Time:146.0min: [2023-12-11 09:12:12,277][model2_sft.py][INFO] Epoch:[0/2](24100/63764) loss:3.679 lr:0.0000114 epoch_Time:146.0min: [2023-12-11 09:12:23,377][model2_sft.py][INFO] Epoch:[0/2](24150/63764) loss:4.185 lr:0.0000113 epoch_Time:146.0min: [2023-12-11 09:12:34,495][model2_sft.py][INFO] Epoch:[0/2](24200/63764) loss:3.481 lr:0.0000113 epoch_Time:145.0min: [2023-12-11 09:12:45,582][model2_sft.py][INFO] Epoch:[0/2](24250/63764) loss:3.705 lr:0.0000113 epoch_Time:145.0min: [2023-12-11 09:12:56,726][model2_sft.py][INFO] Epoch:[0/2](24300/63764) loss:3.919 lr:0.0000112 epoch_Time:145.0min: [2023-12-11 09:13:07,835][model2_sft.py][INFO] Epoch:[0/2](24350/63764) loss:2.750 lr:0.0000112 epoch_Time:145.0min: [2023-12-11 09:13:18,962][model2_sft.py][INFO] Epoch:[0/2](24400/63764) loss:3.244 lr:0.0000112 epoch_Time:145.0min: [2023-12-11 09:13:30,053][model2_sft.py][INFO] Epoch:[0/2](24450/63764) loss:3.840 lr:0.0000111 epoch_Time:144.0min: [2023-12-11 09:13:41,108][model2_sft.py][INFO] Epoch:[0/2](24500/63764) loss:3.564 lr:0.0000111 epoch_Time:144.0min: [2023-12-11 09:13:52,217][model2_sft.py][INFO] Epoch:[0/2](24550/63764) loss:4.228 lr:0.0000111 epoch_Time:144.0min: [2023-12-11 09:14:03,294][model2_sft.py][INFO] Epoch:[0/2](24600/63764) loss:3.882 lr:0.0000110 epoch_Time:144.0min: [2023-12-11 09:14:14,407][model2_sft.py][INFO] Epoch:[0/2](24650/63764) loss:3.848 lr:0.0000110 epoch_Time:144.0min: [2023-12-11 
09:14:25,462][model2_sft.py][INFO] Epoch:[0/2](24700/63764) loss:3.420 lr:0.0000110 epoch_Time:143.0min: [2023-12-11 09:14:36,562][model2_sft.py][INFO] Epoch:[0/2](24750/63764) loss:3.367 lr:0.0000110 epoch_Time:143.0min: [2023-12-11 09:14:47,659][model2_sft.py][INFO] Epoch:[0/2](24800/63764) loss:3.452 lr:0.0000109 epoch_Time:143.0min: [2023-12-11 09:14:58,784][model2_sft.py][INFO] Epoch:[0/2](24850/63764) loss:3.250 lr:0.0000109 epoch_Time:143.0min: [2023-12-11 09:15:09,972][model2_sft.py][INFO] Epoch:[0/2](24900/63764) loss:3.506 lr:0.0000109 epoch_Time:143.0min: [2023-12-11 09:15:21,109][model2_sft.py][INFO] Epoch:[0/2](24950/63764) loss:3.190 lr:0.0000108 epoch_Time:143.0min: [2023-12-11 09:15:32,231][model2_sft.py][INFO] Epoch:[0/2](25000/63764) loss:2.898 lr:0.0000108 epoch_Time:142.0min: [2023-12-11 09:15:43,336][model2_sft.py][INFO] Epoch:[0/2](25050/63764) loss:4.442 lr:0.0000108 epoch_Time:142.0min: [2023-12-11 09:15:54,439][model2_sft.py][INFO] Epoch:[0/2](25100/63764) loss:3.205 lr:0.0000107 epoch_Time:142.0min: [2023-12-11 09:16:05,485][model2_sft.py][INFO] Epoch:[0/2](25150/63764) loss:3.807 lr:0.0000107 epoch_Time:142.0min: [2023-12-11 09:16:16,582][model2_sft.py][INFO] Epoch:[0/2](25200/63764) loss:2.606 lr:0.0000107 epoch_Time:142.0min: [2023-12-11 09:16:27,718][model2_sft.py][INFO] Epoch:[0/2](25250/63764) loss:3.271 lr:0.0000107 epoch_Time:141.0min: [2023-12-11 09:16:38,807][model2_sft.py][INFO] Epoch:[0/2](25300/63764) loss:3.902 lr:0.0000106 epoch_Time:141.0min: [2023-12-11 09:16:49,919][model2_sft.py][INFO] Epoch:[0/2](25350/63764) loss:3.511 lr:0.0000106 epoch_Time:141.0min: [2023-12-11 09:17:01,021][model2_sft.py][INFO] Epoch:[0/2](25400/63764) loss:3.956 lr:0.0000106 epoch_Time:141.0min: [2023-12-11 09:17:12,107][model2_sft.py][INFO] Epoch:[0/2](25450/63764) loss:3.523 lr:0.0000105 epoch_Time:141.0min: [2023-12-11 09:17:23,223][model2_sft.py][INFO] Epoch:[0/2](25500/63764) loss:3.919 lr:0.0000105 epoch_Time:141.0min: [2023-12-11 
09:17:34,307][model2_sft.py][INFO] Epoch:[0/2](25550/63764) loss:4.085 lr:0.0000105 epoch_Time:140.0min: [2023-12-11 09:17:45,432][model2_sft.py][INFO] Epoch:[0/2](25600/63764) loss:3.695 lr:0.0000104 epoch_Time:140.0min: [2023-12-11 09:17:56,560][model2_sft.py][INFO] Epoch:[0/2](25650/63764) loss:3.378 lr:0.0000104 epoch_Time:140.0min: [2023-12-11 09:18:07,669][model2_sft.py][INFO] Epoch:[0/2](25700/63764) loss:4.100 lr:0.0000104 epoch_Time:140.0min: [2023-12-11 09:18:18,805][model2_sft.py][INFO] Epoch:[0/2](25750/63764) loss:3.373 lr:0.0000103 epoch_Time:140.0min: [2023-12-11 09:18:29,962][model2_sft.py][INFO] Epoch:[0/2](25800/63764) loss:4.337 lr:0.0000103 epoch_Time:139.0min: [2023-12-11 09:18:41,043][model2_sft.py][INFO] Epoch:[0/2](25850/63764) loss:3.404 lr:0.0000103 epoch_Time:139.0min: [2023-12-11 09:18:52,156][model2_sft.py][INFO] Epoch:[0/2](25900/63764) loss:3.470 lr:0.0000103 epoch_Time:139.0min: [2023-12-11 09:19:03,285][model2_sft.py][INFO] Epoch:[0/2](25950/63764) loss:3.526 lr:0.0000102 epoch_Time:139.0min: [2023-12-11 09:19:14,374][model2_sft.py][INFO] Epoch:[0/2](26000/63764) loss:3.506 lr:0.0000102 epoch_Time:139.0min: [2023-12-11 09:19:25,441][model2_sft.py][INFO] Epoch:[0/2](26050/63764) loss:3.400 lr:0.0000102 epoch_Time:138.0min: [2023-12-11 09:19:36,530][model2_sft.py][INFO] Epoch:[0/2](26100/63764) loss:3.969 lr:0.0000101 epoch_Time:138.0min: [2023-12-11 09:19:47,621][model2_sft.py][INFO] Epoch:[0/2](26150/63764) loss:3.826 lr:0.0000101 epoch_Time:138.0min: [2023-12-11 09:19:58,714][model2_sft.py][INFO] Epoch:[0/2](26200/63764) loss:3.611 lr:0.0000101 epoch_Time:138.0min: [2023-12-11 09:20:09,814][model2_sft.py][INFO] Epoch:[0/2](26250/63764) loss:3.674 lr:0.0000100 epoch_Time:138.0min: [2023-12-11 09:20:20,888][model2_sft.py][INFO] Epoch:[0/2](26300/63764) loss:3.242 lr:0.0000100 epoch_Time:138.0min: [2023-12-11 09:20:31,944][model2_sft.py][INFO] Epoch:[0/2](26350/63764) loss:3.840 lr:0.0000100 epoch_Time:137.0min: [2023-12-11 
09:20:43,030][model2_sft.py][INFO] Epoch:[0/2](26400/63764) loss:3.366 lr:0.0000100 epoch_Time:137.0min: [2023-12-11 09:20:54,150][model2_sft.py][INFO] Epoch:[0/2](26450/63764) loss:3.773 lr:0.0000099 epoch_Time:137.0min: [2023-12-11 09:21:05,278][model2_sft.py][INFO] Epoch:[0/2](26500/63764) loss:3.649 lr:0.0000099 epoch_Time:138.0min: [2023-12-11 09:21:16,388][model2_sft.py][INFO] Epoch:[0/2](26550/63764) loss:2.990 lr:0.0000099 epoch_Time:138.0min: [2023-12-11 09:21:27,459][model2_sft.py][INFO] Epoch:[0/2](26600/63764) loss:3.784 lr:0.0000098 epoch_Time:137.0min: [2023-12-11 09:21:38,541][model2_sft.py][INFO] Epoch:[0/2](26650/63764) loss:3.286 lr:0.0000098 epoch_Time:137.0min: [2023-12-11 09:21:49,665][model2_sft.py][INFO] Epoch:[0/2](26700/63764) loss:3.029 lr:0.0000098 epoch_Time:137.0min: [2023-12-11 09:22:00,740][model2_sft.py][INFO] Epoch:[0/2](26750/63764) loss:3.695 lr:0.0000097 epoch_Time:137.0min: [2023-12-11 09:22:11,921][model2_sft.py][INFO] Epoch:[0/2](26800/63764) loss:4.013 lr:0.0000097 epoch_Time:137.0min: [2023-12-11 09:22:23,060][model2_sft.py][INFO] Epoch:[0/2](26850/63764) loss:3.031 lr:0.0000097 epoch_Time:137.0min: [2023-12-11 09:22:34,150][model2_sft.py][INFO] Epoch:[0/2](26900/63764) loss:4.326 lr:0.0000096 epoch_Time:136.0min: [2023-12-11 09:22:45,246][model2_sft.py][INFO] Epoch:[0/2](26950/63764) loss:3.425 lr:0.0000096 epoch_Time:136.0min: [2023-12-11 09:22:56,348][model2_sft.py][INFO] Epoch:[0/2](27000/63764) loss:3.969 lr:0.0000096 epoch_Time:136.0min: [2023-12-11 09:23:07,440][model2_sft.py][INFO] Epoch:[0/2](27050/63764) loss:3.463 lr:0.0000096 epoch_Time:136.0min: [2023-12-11 09:23:18,578][model2_sft.py][INFO] Epoch:[0/2](27100/63764) loss:3.473 lr:0.0000095 epoch_Time:136.0min: [2023-12-11 09:23:29,693][model2_sft.py][INFO] Epoch:[0/2](27150/63764) loss:3.696 lr:0.0000095 epoch_Time:135.0min: [2023-12-11 09:23:40,758][model2_sft.py][INFO] Epoch:[0/2](27200/63764) loss:2.729 lr:0.0000095 epoch_Time:135.0min: [2023-12-11 
09:23:51,838][model2_sft.py][INFO] Epoch:[0/2](27250/63764) loss:3.301 lr:0.0000094 epoch_Time:135.0min: [2023-12-11 09:24:02,920][model2_sft.py][INFO] Epoch:[0/2](27300/63764) loss:3.943 lr:0.0000094 epoch_Time:135.0min: [2023-12-11 09:24:13,998][model2_sft.py][INFO] Epoch:[0/2](27350/63764) loss:2.844 lr:0.0000094 epoch_Time:135.0min: [2023-12-11 09:24:25,057][model2_sft.py][INFO] Epoch:[0/2](27400/63764) loss:3.975 lr:0.0000093 epoch_Time:135.0min: [2023-12-11 09:24:36,151][model2_sft.py][INFO] Epoch:[0/2](27450/63764) loss:3.710 lr:0.0000093 epoch_Time:134.0min: [2023-12-11 09:24:47,234][model2_sft.py][INFO] Epoch:[0/2](27500/63764) loss:3.232 lr:0.0000093 epoch_Time:134.0min: [2023-12-11 09:24:58,342][model2_sft.py][INFO] Epoch:[0/2](27550/63764) loss:3.533 lr:0.0000093 epoch_Time:134.0min: [2023-12-11 09:25:09,431][model2_sft.py][INFO] Epoch:[0/2](27600/63764) loss:3.482 lr:0.0000092 epoch_Time:134.0min: [2023-12-11 09:25:20,511][model2_sft.py][INFO] Epoch:[0/2](27650/63764) loss:3.325 lr:0.0000092 epoch_Time:134.0min: [2023-12-11 09:25:31,576][model2_sft.py][INFO] Epoch:[0/2](27700/63764) loss:3.640 lr:0.0000092 epoch_Time:133.0min: [2023-12-11 09:25:42,694][model2_sft.py][INFO] Epoch:[0/2](27750/63764) loss:3.220 lr:0.0000091 epoch_Time:133.0min: [2023-12-11 09:25:53,754][model2_sft.py][INFO] Epoch:[0/2](27800/63764) loss:3.692 lr:0.0000091 epoch_Time:133.0min: [2023-12-11 09:26:04,899][model2_sft.py][INFO] Epoch:[0/2](27850/63764) loss:2.971 lr:0.0000091 epoch_Time:133.0min: [2023-12-11 09:26:15,963][model2_sft.py][INFO] Epoch:[0/2](27900/63764) loss:3.513 lr:0.0000090 epoch_Time:133.0min: [2023-12-11 09:26:27,064][model2_sft.py][INFO] Epoch:[0/2](27950/63764) loss:3.847 lr:0.0000090 epoch_Time:132.0min: [2023-12-11 09:26:38,140][model2_sft.py][INFO] Epoch:[0/2](28000/63764) loss:3.771 lr:0.0000090 epoch_Time:132.0min: [2023-12-11 09:26:49,235][model2_sft.py][INFO] Epoch:[0/2](28050/63764) loss:3.131 lr:0.0000090 epoch_Time:132.0min: [2023-12-11 
09:27:00,267][model2_sft.py][INFO] Epoch:[0/2](28100/63764) loss:3.722 lr:0.0000089 epoch_Time:132.0min: [2023-12-11 09:27:11,341][model2_sft.py][INFO] Epoch:[0/2](28150/63764) loss:2.915 lr:0.0000089 epoch_Time:132.0min: [2023-12-11 09:27:22,418][model2_sft.py][INFO] Epoch:[0/2](28200/63764) loss:3.280 lr:0.0000089 epoch_Time:132.0min: [2023-12-11 09:27:33,502][model2_sft.py][INFO] Epoch:[0/2](28250/63764) loss:3.111 lr:0.0000088 epoch_Time:131.0min: [2023-12-11 09:27:44,560][model2_sft.py][INFO] Epoch:[0/2](28300/63764) loss:4.257 lr:0.0000088 epoch_Time:131.0min: [2023-12-11 09:27:55,607][model2_sft.py][INFO] Epoch:[0/2](28350/63764) loss:3.519 lr:0.0000088 epoch_Time:131.0min: [2023-12-11 09:28:06,722][model2_sft.py][INFO] Epoch:[0/2](28400/63764) loss:3.714 lr:0.0000087 epoch_Time:131.0min: [2023-12-11 09:28:17,802][model2_sft.py][INFO] Epoch:[0/2](28450/63764) loss:3.480 lr:0.0000087 epoch_Time:131.0min: [2023-12-11 09:28:28,860][model2_sft.py][INFO] Epoch:[0/2](28500/63764) loss:3.527 lr:0.0000087 epoch_Time:130.0min: [2023-12-11 09:28:39,943][model2_sft.py][INFO] Epoch:[0/2](28550/63764) loss:3.569 lr:0.0000087 epoch_Time:130.0min: [2023-12-11 09:28:51,042][model2_sft.py][INFO] Epoch:[0/2](28600/63764) loss:3.684 lr:0.0000086 epoch_Time:130.0min: [2023-12-11 09:29:02,197][model2_sft.py][INFO] Epoch:[0/2](28650/63764) loss:4.096 lr:0.0000086 epoch_Time:130.0min: [2023-12-11 09:29:13,288][model2_sft.py][INFO] Epoch:[0/2](28700/63764) loss:3.597 lr:0.0000086 epoch_Time:130.0min: [2023-12-11 09:29:24,398][model2_sft.py][INFO] Epoch:[0/2](28750/63764) loss:3.495 lr:0.0000085 epoch_Time:130.0min: [2023-12-11 09:29:35,483][model2_sft.py][INFO] Epoch:[0/2](28800/63764) loss:3.524 lr:0.0000085 epoch_Time:129.0min: [2023-12-11 09:29:46,617][model2_sft.py][INFO] Epoch:[0/2](28850/63764) loss:3.337 lr:0.0000085 epoch_Time:129.0min: [2023-12-11 09:29:57,694][model2_sft.py][INFO] Epoch:[0/2](28900/63764) loss:4.125 lr:0.0000084 epoch_Time:129.0min: [2023-12-11 
09:30:08,889][model2_sft.py][INFO] Epoch:[0/2](28950/63764) loss:3.754 lr:0.0000084 epoch_Time:129.0min: [2023-12-11 09:30:19,948][model2_sft.py][INFO] Epoch:[0/2](29000/63764) loss:3.098 lr:0.0000084 epoch_Time:129.0min: [2023-12-11 09:30:31,069][model2_sft.py][INFO] Epoch:[0/2](29050/63764) loss:3.754 lr:0.0000084 epoch_Time:128.0min: [2023-12-11 09:30:42,158][model2_sft.py][INFO] Epoch:[0/2](29100/63764) loss:3.116 lr:0.0000083 epoch_Time:128.0min: [2023-12-11 09:30:53,236][model2_sft.py][INFO] Epoch:[0/2](29150/63764) loss:3.586 lr:0.0000083 epoch_Time:128.0min: [2023-12-11 09:31:04,354][model2_sft.py][INFO] Epoch:[0/2](29200/63764) loss:3.621 lr:0.0000083 epoch_Time:128.0min: [2023-12-11 09:31:15,474][model2_sft.py][INFO] Epoch:[0/2](29250/63764) loss:3.653 lr:0.0000082 epoch_Time:128.0min: [2023-12-11 09:31:26,557][model2_sft.py][INFO] Epoch:[0/2](29300/63764) loss:3.450 lr:0.0000082 epoch_Time:127.0min: [2023-12-11 09:31:37,672][model2_sft.py][INFO] Epoch:[0/2](29350/63764) loss:2.861 lr:0.0000082 epoch_Time:127.0min: [2023-12-11 09:31:48,808][model2_sft.py][INFO] Epoch:[0/2](29400/63764) loss:3.548 lr:0.0000081 epoch_Time:127.0min: [2023-12-11 09:31:59,930][model2_sft.py][INFO] Epoch:[0/2](29450/63764) loss:3.602 lr:0.0000081 epoch_Time:127.0min: [2023-12-11 09:32:11,026][model2_sft.py][INFO] Epoch:[0/2](29500/63764) loss:3.821 lr:0.0000081 epoch_Time:127.0min: [2023-12-11 09:32:22,132][model2_sft.py][INFO] Epoch:[0/2](29550/63764) loss:4.034 lr:0.0000081 epoch_Time:127.0min: [2023-12-11 09:32:33,262][model2_sft.py][INFO] Epoch:[0/2](29600/63764) loss:3.259 lr:0.0000080 epoch_Time:126.0min: [2023-12-11 09:32:44,346][model2_sft.py][INFO] Epoch:[0/2](29650/63764) loss:3.427 lr:0.0000080 epoch_Time:126.0min: [2023-12-11 09:32:55,451][model2_sft.py][INFO] Epoch:[0/2](29700/63764) loss:3.875 lr:0.0000080 epoch_Time:126.0min: [2023-12-11 09:33:06,572][model2_sft.py][INFO] Epoch:[0/2](29750/63764) loss:3.520 lr:0.0000079 epoch_Time:126.0min: [2023-12-11 
09:33:17,647][model2_sft.py][INFO] Epoch:[0/2](29800/63764) loss:4.057 lr:0.0000079 epoch_Time:126.0min: [2023-12-11 09:33:28,754][model2_sft.py][INFO] Epoch:[0/2](29850/63764) loss:2.987 lr:0.0000079 epoch_Time:125.0min: [2023-12-11 09:33:39,902][model2_sft.py][INFO] Epoch:[0/2](29900/63764) loss:2.970 lr:0.0000079 epoch_Time:125.0min: [2023-12-11 09:33:51,069][model2_sft.py][INFO] Epoch:[0/2](29950/63764) loss:3.609 lr:0.0000078 epoch_Time:125.0min: [2023-12-11 09:34:02,191][model2_sft.py][INFO] Epoch:[0/2](30000/63764) loss:3.334 lr:0.0000078 epoch_Time:125.0min: [2023-12-11 09:34:13,295][model2_sft.py][INFO] Epoch:[0/2](30050/63764) loss:3.618 lr:0.0000078 epoch_Time:125.0min: [2023-12-11 09:34:24,388][model2_sft.py][INFO] Epoch:[0/2](30100/63764) loss:3.141 lr:0.0000077 epoch_Time:125.0min: [2023-12-11 09:34:35,451][model2_sft.py][INFO] Epoch:[0/2](30150/63764) loss:4.141 lr:0.0000077 epoch_Time:124.0min: [2023-12-11 09:34:46,559][model2_sft.py][INFO] Epoch:[0/2](30200/63764) loss:3.509 lr:0.0000077 epoch_Time:124.0min: [2023-12-11 09:34:57,676][model2_sft.py][INFO] Epoch:[0/2](30250/63764) loss:4.145 lr:0.0000077 epoch_Time:124.0min: [2023-12-11 09:35:08,797][model2_sft.py][INFO] Epoch:[0/2](30300/63764) loss:3.529 lr:0.0000076 epoch_Time:124.0min: [2023-12-11 09:35:19,885][model2_sft.py][INFO] Epoch:[0/2](30350/63764) loss:3.593 lr:0.0000076 epoch_Time:124.0min: [2023-12-11 09:35:30,977][model2_sft.py][INFO] Epoch:[0/2](30400/63764) loss:3.775 lr:0.0000076 epoch_Time:123.0min: [2023-12-11 09:35:42,145][model2_sft.py][INFO] Epoch:[0/2](30450/63764) loss:3.224 lr:0.0000075 epoch_Time:123.0min: [2023-12-11 09:35:53,334][model2_sft.py][INFO] Epoch:[0/2](30500/63764) loss:3.296 lr:0.0000075 epoch_Time:123.0min: [2023-12-11 09:36:04,505][model2_sft.py][INFO] Epoch:[0/2](30550/63764) loss:3.251 lr:0.0000075 epoch_Time:123.0min: [2023-12-11 09:36:15,656][model2_sft.py][INFO] Epoch:[0/2](30600/63764) loss:3.643 lr:0.0000074 epoch_Time:123.0min: [2023-12-11 
09:36:26,756][model2_sft.py][INFO] Epoch:[0/2](30650/63764) loss:2.979 lr:0.0000074 epoch_Time:122.0min: [2023-12-11 09:36:37,846][model2_sft.py][INFO] Epoch:[0/2](30700/63764) loss:3.260 lr:0.0000074 epoch_Time:122.0min: [2023-12-11 09:36:48,959][model2_sft.py][INFO] Epoch:[0/2](30750/63764) loss:3.479 lr:0.0000074 epoch_Time:122.0min: [2023-12-11 09:37:00,118][model2_sft.py][INFO] Epoch:[0/2](30800/63764) loss:4.281 lr:0.0000073 epoch_Time:122.0min: [2023-12-11 09:37:11,264][model2_sft.py][INFO] Epoch:[0/2](30850/63764) loss:3.327 lr:0.0000073 epoch_Time:122.0min: [2023-12-11 09:37:22,327][model2_sft.py][INFO] Epoch:[0/2](30900/63764) loss:4.222 lr:0.0000073 epoch_Time:122.0min: [2023-12-11 09:37:33,377][model2_sft.py][INFO] Epoch:[0/2](30950/63764) loss:3.454 lr:0.0000072 epoch_Time:121.0min: [2023-12-11 09:37:44,485][model2_sft.py][INFO] Epoch:[0/2](31000/63764) loss:2.828 lr:0.0000072 epoch_Time:121.0min: [2023-12-11 09:37:55,584][model2_sft.py][INFO] Epoch:[0/2](31050/63764) loss:4.160 lr:0.0000072 epoch_Time:121.0min: [2023-12-11 09:38:06,776][model2_sft.py][INFO] Epoch:[0/2](31100/63764) loss:3.893 lr:0.0000072 epoch_Time:121.0min: [2023-12-11 09:38:17,914][model2_sft.py][INFO] Epoch:[0/2](31150/63764) loss:3.686 lr:0.0000071 epoch_Time:121.0min: [2023-12-11 09:38:29,025][model2_sft.py][INFO] Epoch:[0/2](31200/63764) loss:3.936 lr:0.0000071 epoch_Time:120.0min: [2023-12-11 09:38:40,117][model2_sft.py][INFO] Epoch:[0/2](31250/63764) loss:3.316 lr:0.0000071 epoch_Time:120.0min: [2023-12-11 09:38:51,161][model2_sft.py][INFO] Epoch:[0/2](31300/63764) loss:3.315 lr:0.0000070 epoch_Time:120.0min: [2023-12-11 09:39:02,256][model2_sft.py][INFO] Epoch:[0/2](31350/63764) loss:3.085 lr:0.0000070 epoch_Time:120.0min: [2023-12-11 09:39:13,300][model2_sft.py][INFO] Epoch:[0/2](31400/63764) loss:3.384 lr:0.0000070 epoch_Time:120.0min: [2023-12-11 09:39:24,389][model2_sft.py][INFO] Epoch:[0/2](31450/63764) loss:3.478 lr:0.0000070 epoch_Time:120.0min: [2023-12-11 
09:39:35,457][model2_sft.py][INFO] Epoch:[0/2](31500/63764) loss:3.974 lr:0.0000069 epoch_Time:119.0min: [2023-12-11 09:39:46,517][model2_sft.py][INFO] Epoch:[0/2](31550/63764) loss:3.561 lr:0.0000069 epoch_Time:119.0min: [2023-12-11 09:39:57,618][model2_sft.py][INFO] Epoch:[0/2](31600/63764) loss:3.011 lr:0.0000069 epoch_Time:119.0min: [2023-12-11 09:40:08,742][model2_sft.py][INFO] Epoch:[0/2](31650/63764) loss:3.144 lr:0.0000069 epoch_Time:119.0min: [2023-12-11 09:40:19,896][model2_sft.py][INFO] Epoch:[0/2](31700/63764) loss:3.287 lr:0.0000068 epoch_Time:119.0min: [2023-12-11 09:40:31,078][model2_sft.py][INFO] Epoch:[0/2](31750/63764) loss:3.721 lr:0.0000068 epoch_Time:118.0min: [2023-12-11 09:40:42,181][model2_sft.py][INFO] Epoch:[0/2](31800/63764) loss:3.405 lr:0.0000068 epoch_Time:118.0min: [2023-12-11 09:40:53,287][model2_sft.py][INFO] Epoch:[0/2](31850/63764) loss:3.220 lr:0.0000067 epoch_Time:118.0min: [2023-12-11 09:41:04,421][model2_sft.py][INFO] Epoch:[0/2](31900/63764) loss:3.806 lr:0.0000067 epoch_Time:118.0min: [2023-12-11 09:41:15,538][model2_sft.py][INFO] Epoch:[0/2](31950/63764) loss:3.178 lr:0.0000067 epoch_Time:118.0min: [2023-12-11 09:41:26,642][model2_sft.py][INFO] Epoch:[0/2](32000/63764) loss:4.521 lr:0.0000067 epoch_Time:117.0min: [2023-12-11 09:41:37,707][model2_sft.py][INFO] Epoch:[0/2](32050/63764) loss:3.457 lr:0.0000066 epoch_Time:117.0min: [2023-12-11 09:41:48,824][model2_sft.py][INFO] Epoch:[0/2](32100/63764) loss:3.992 lr:0.0000066 epoch_Time:117.0min: [2023-12-11 09:41:59,973][model2_sft.py][INFO] Epoch:[0/2](32150/63764) loss:3.187 lr:0.0000066 epoch_Time:117.0min: [2023-12-11 09:42:11,133][model2_sft.py][INFO] Epoch:[0/2](32200/63764) loss:3.764 lr:0.0000065 epoch_Time:117.0min: [2023-12-11 09:42:22,241][model2_sft.py][INFO] Epoch:[0/2](32250/63764) loss:3.749 lr:0.0000065 epoch_Time:117.0min: [2023-12-11 09:42:33,327][model2_sft.py][INFO] Epoch:[0/2](32300/63764) loss:3.934 lr:0.0000065 epoch_Time:116.0min: [2023-12-11 
09:42:44,514][model2_sft.py][INFO] Epoch:[0/2](32350/63764) loss:3.294 lr:0.0000065 epoch_Time:116.0min: [2023-12-11 09:42:55,650][model2_sft.py][INFO] Epoch:[0/2](32400/63764) loss:3.451 lr:0.0000064 epoch_Time:116.0min: [2023-12-11 09:43:06,804][model2_sft.py][INFO] Epoch:[0/2](32450/63764) loss:2.527 lr:0.0000064 epoch_Time:116.0min: [2023-12-11 09:43:17,967][model2_sft.py][INFO] Epoch:[0/2](32500/63764) loss:3.922 lr:0.0000064 epoch_Time:116.0min: [2023-12-11 09:43:29,062][model2_sft.py][INFO] Epoch:[0/2](32550/63764) loss:3.646 lr:0.0000064 epoch_Time:115.0min: [2023-12-11 09:43:40,190][model2_sft.py][INFO] Epoch:[0/2](32600/63764) loss:3.267 lr:0.0000063 epoch_Time:115.0min: [2023-12-11 09:43:51,319][model2_sft.py][INFO] Epoch:[0/2](32650/63764) loss:3.345 lr:0.0000063 epoch_Time:115.0min: [2023-12-11 09:44:02,441][model2_sft.py][INFO] Epoch:[0/2](32700/63764) loss:3.440 lr:0.0000063 epoch_Time:115.0min: [2023-12-11 09:44:13,487][model2_sft.py][INFO] Epoch:[0/2](32750/63764) loss:2.889 lr:0.0000062 epoch_Time:115.0min: [2023-12-11 09:44:24,594][model2_sft.py][INFO] Epoch:[0/2](32800/63764) loss:3.761 lr:0.0000062 epoch_Time:115.0min: [2023-12-11 09:44:35,768][model2_sft.py][INFO] Epoch:[0/2](32850/63764) loss:3.035 lr:0.0000062 epoch_Time:114.0min: [2023-12-11 09:44:46,908][model2_sft.py][INFO] Epoch:[0/2](32900/63764) loss:2.881 lr:0.0000062 epoch_Time:114.0min: [2023-12-11 09:44:58,051][model2_sft.py][INFO] Epoch:[0/2](32950/63764) loss:3.752 lr:0.0000061 epoch_Time:114.0min: [2023-12-11 09:45:09,177][model2_sft.py][INFO] Epoch:[0/2](33000/63764) loss:3.218 lr:0.0000061 epoch_Time:114.0min: [2023-12-11 09:45:20,315][model2_sft.py][INFO] Epoch:[0/2](33050/63764) loss:3.674 lr:0.0000061 epoch_Time:114.0min: [2023-12-11 09:45:31,442][model2_sft.py][INFO] Epoch:[0/2](33100/63764) loss:3.336 lr:0.0000061 epoch_Time:113.0min: [2023-12-11 09:45:42,612][model2_sft.py][INFO] Epoch:[0/2](33150/63764) loss:3.730 lr:0.0000060 epoch_Time:113.0min: [2023-12-11 
09:45:53,713][model2_sft.py][INFO] Epoch:[0/2](33200/63764) loss:3.797 lr:0.0000060 epoch_Time:113.0min: [2023-12-11 09:46:04,823][model2_sft.py][INFO] Epoch:[0/2](33250/63764) loss:3.345 lr:0.0000060 epoch_Time:113.0min: [2023-12-11 09:46:15,916][model2_sft.py][INFO] Epoch:[0/2](33300/63764) loss:3.284 lr:0.0000059 epoch_Time:113.0min: [2023-12-11 09:46:27,066][model2_sft.py][INFO] Epoch:[0/2](33350/63764) loss:3.406 lr:0.0000059 epoch_Time:112.0min: [2023-12-11 09:46:38,152][model2_sft.py][INFO] Epoch:[0/2](33400/63764) loss:3.505 lr:0.0000059 epoch_Time:112.0min: [2023-12-11 09:46:49,308][model2_sft.py][INFO] Epoch:[0/2](33450/63764) loss:3.680 lr:0.0000059 epoch_Time:112.0min: [2023-12-11 09:47:00,439][model2_sft.py][INFO] Epoch:[0/2](33500/63764) loss:3.194 lr:0.0000058 epoch_Time:112.0min: [2023-12-11 09:47:11,554][model2_sft.py][INFO] Epoch:[0/2](33550/63764) loss:3.116 lr:0.0000058 epoch_Time:112.0min: [2023-12-11 09:47:22,640][model2_sft.py][INFO] Epoch:[0/2](33600/63764) loss:3.453 lr:0.0000058 epoch_Time:112.0min: [2023-12-11 09:47:33,701][model2_sft.py][INFO] Epoch:[0/2](33650/63764) loss:3.047 lr:0.0000058 epoch_Time:111.0min: [2023-12-11 09:47:44,796][model2_sft.py][INFO] Epoch:[0/2](33700/63764) loss:3.869 lr:0.0000057 epoch_Time:111.0min: [2023-12-11 09:47:55,925][model2_sft.py][INFO] Epoch:[0/2](33750/63764) loss:2.898 lr:0.0000057 epoch_Time:111.0min: [2023-12-11 09:48:07,019][model2_sft.py][INFO] Epoch:[0/2](33800/63764) loss:3.194 lr:0.0000057 epoch_Time:111.0min: [2023-12-11 09:48:18,156][model2_sft.py][INFO] Epoch:[0/2](33850/63764) loss:3.104 lr:0.0000057 epoch_Time:111.0min: [2023-12-11 09:48:29,278][model2_sft.py][INFO] Epoch:[0/2](33900/63764) loss:3.669 lr:0.0000056 epoch_Time:110.0min: [2023-12-11 09:48:40,383][model2_sft.py][INFO] Epoch:[0/2](33950/63764) loss:3.490 lr:0.0000056 epoch_Time:110.0min: [2023-12-11 09:48:51,477][model2_sft.py][INFO] Epoch:[0/2](34000/63764) loss:3.385 lr:0.0000056 epoch_Time:110.0min: [2023-12-11 
09:49:02,582][model2_sft.py][INFO] Epoch:[0/2](34050/63764) loss:4.085 lr:0.0000055 epoch_Time:110.0min: [2023-12-11 09:49:13,719][model2_sft.py][INFO] Epoch:[0/2](34100/63764) loss:3.611 lr:0.0000055 epoch_Time:110.0min: [2023-12-11 09:49:24,848][model2_sft.py][INFO] Epoch:[0/2](34150/63764) loss:3.614 lr:0.0000055 epoch_Time:110.0min: [2023-12-11 09:49:35,933][model2_sft.py][INFO] Epoch:[0/2](34200/63764) loss:3.132 lr:0.0000055 epoch_Time:109.0min: [2023-12-11 09:49:47,068][model2_sft.py][INFO] Epoch:[0/2](34250/63764) loss:3.057 lr:0.0000054 epoch_Time:109.0min: [2023-12-11 09:49:58,146][model2_sft.py][INFO] Epoch:[0/2](34300/63764) loss:3.266 lr:0.0000054 epoch_Time:109.0min: [2023-12-11 09:50:09,355][model2_sft.py][INFO] Epoch:[0/2](34350/63764) loss:3.024 lr:0.0000054 epoch_Time:109.0min: [2023-12-11 09:50:20,524][model2_sft.py][INFO] Epoch:[0/2](34400/63764) loss:3.893 lr:0.0000054 epoch_Time:109.0min: [2023-12-11 09:50:31,764][model2_sft.py][INFO] Epoch:[0/2](34450/63764) loss:4.161 lr:0.0000053 epoch_Time:108.0min: [2023-12-11 09:50:42,997][model2_sft.py][INFO] Epoch:[0/2](34500/63764) loss:3.083 lr:0.0000053 epoch_Time:108.0min: [2023-12-11 09:50:54,094][model2_sft.py][INFO] Epoch:[0/2](34550/63764) loss:3.272 lr:0.0000053 epoch_Time:108.0min: [2023-12-11 09:51:05,294][model2_sft.py][INFO] Epoch:[0/2](34600/63764) loss:3.656 lr:0.0000053 epoch_Time:108.0min: [2023-12-11 09:51:16,472][model2_sft.py][INFO] Epoch:[0/2](34650/63764) loss:3.417 lr:0.0000052 epoch_Time:108.0min: [2023-12-11 09:51:27,612][model2_sft.py][INFO] Epoch:[0/2](34700/63764) loss:2.972 lr:0.0000052 epoch_Time:107.0min: [2023-12-11 09:51:38,774][model2_sft.py][INFO] Epoch:[0/2](34750/63764) loss:4.096 lr:0.0000052 epoch_Time:107.0min: [2023-12-11 09:51:49,936][model2_sft.py][INFO] Epoch:[0/2](34800/63764) loss:3.079 lr:0.0000052 epoch_Time:107.0min: [2023-12-11 09:52:01,163][model2_sft.py][INFO] Epoch:[0/2](34850/63764) loss:3.657 lr:0.0000051 epoch_Time:107.0min: [2023-12-11 
09:52:12,288][model2_sft.py][INFO] Epoch:[0/2](34900/63764) loss:3.408 lr:0.0000051 epoch_Time:107.0min: [2023-12-11 09:52:23,385][model2_sft.py][INFO] Epoch:[0/2](34950/63764) loss:3.113 lr:0.0000051 epoch_Time:107.0min: [2023-12-11 09:52:34,520][model2_sft.py][INFO] Epoch:[0/2](35000/63764) loss:3.293 lr:0.0000051 epoch_Time:106.0min: [2023-12-11 09:52:45,635][model2_sft.py][INFO] Epoch:[0/2](35050/63764) loss:3.444 lr:0.0000050 epoch_Time:106.0min: [2023-12-11 09:52:56,767][model2_sft.py][INFO] Epoch:[0/2](35100/63764) loss:3.530 lr:0.0000050 epoch_Time:106.0min: [2023-12-11 09:53:07,854][model2_sft.py][INFO] Epoch:[0/2](35150/63764) loss:3.292 lr:0.0000050 epoch_Time:106.0min: [2023-12-11 09:53:18,925][model2_sft.py][INFO] Epoch:[0/2](35200/63764) loss:3.149 lr:0.0000050 epoch_Time:106.0min: [2023-12-11 09:53:30,075][model2_sft.py][INFO] Epoch:[0/2](35250/63764) loss:3.078 lr:0.0000049 epoch_Time:105.0min: [2023-12-11 09:53:41,185][model2_sft.py][INFO] Epoch:[0/2](35300/63764) loss:3.595 lr:0.0000049 epoch_Time:105.0min: [2023-12-11 09:53:52,281][model2_sft.py][INFO] Epoch:[0/2](35350/63764) loss:3.733 lr:0.0000049 epoch_Time:105.0min: [2023-12-11 09:54:03,417][model2_sft.py][INFO] Epoch:[0/2](35400/63764) loss:3.579 lr:0.0000049 epoch_Time:105.0min: [2023-12-11 09:54:14,511][model2_sft.py][INFO] Epoch:[0/2](35450/63764) loss:3.901 lr:0.0000048 epoch_Time:105.0min: [2023-12-11 09:54:25,633][model2_sft.py][INFO] Epoch:[0/2](35500/63764) loss:4.008 lr:0.0000048 epoch_Time:104.0min: [2023-12-11 09:54:36,772][model2_sft.py][INFO] Epoch:[0/2](35550/63764) loss:3.494 lr:0.0000048 epoch_Time:104.0min: [2023-12-11 09:54:47,933][model2_sft.py][INFO] Epoch:[0/2](35600/63764) loss:3.195 lr:0.0000048 epoch_Time:104.0min: [2023-12-11 09:54:59,069][model2_sft.py][INFO] Epoch:[0/2](35650/63764) loss:3.643 lr:0.0000047 epoch_Time:104.0min: [2023-12-11 09:55:10,198][model2_sft.py][INFO] Epoch:[0/2](35700/63764) loss:2.933 lr:0.0000047 epoch_Time:104.0min: [2023-12-11 
09:55:21,386][model2_sft.py][INFO] Epoch:[0/2](35750/63764) loss:3.432 lr:0.0000047 epoch_Time:104.0min: [2023-12-11 09:55:32,522][model2_sft.py][INFO] Epoch:[0/2](35800/63764) loss:3.283 lr:0.0000047 epoch_Time:103.0min: [2023-12-11 09:55:43,619][model2_sft.py][INFO] Epoch:[0/2](35850/63764) loss:3.692 lr:0.0000046 epoch_Time:103.0min: [2023-12-11 09:55:54,728][model2_sft.py][INFO] Epoch:[0/2](35900/63764) loss:4.038 lr:0.0000046 epoch_Time:103.0min: [2023-12-11 09:56:05,853][model2_sft.py][INFO] Epoch:[0/2](35950/63764) loss:3.263 lr:0.0000046 epoch_Time:103.0min: [2023-12-11 09:56:16,950][model2_sft.py][INFO] Epoch:[0/2](36000/63764) loss:2.918 lr:0.0000046 epoch_Time:103.0min: [2023-12-11 09:56:28,119][model2_sft.py][INFO] Epoch:[0/2](36050/63764) loss:3.666 lr:0.0000046 epoch_Time:102.0min: [2023-12-11 09:56:39,293][model2_sft.py][INFO] Epoch:[0/2](36100/63764) loss:3.442 lr:0.0000045 epoch_Time:102.0min: [2023-12-11 09:56:50,390][model2_sft.py][INFO] Epoch:[0/2](36150/63764) loss:3.695 lr:0.0000045 epoch_Time:102.0min: [2023-12-11 09:57:01,501][model2_sft.py][INFO] Epoch:[0/2](36200/63764) loss:3.715 lr:0.0000045 epoch_Time:102.0min: [2023-12-11 09:57:12,626][model2_sft.py][INFO] Epoch:[0/2](36250/63764) loss:3.157 lr:0.0000045 epoch_Time:102.0min: [2023-12-11 09:57:23,741][model2_sft.py][INFO] Epoch:[0/2](36300/63764) loss:4.175 lr:0.0000044 epoch_Time:102.0min: [2023-12-11 09:57:34,878][model2_sft.py][INFO] Epoch:[0/2](36350/63764) loss:3.127 lr:0.0000044 epoch_Time:101.0min: [2023-12-11 09:57:46,005][model2_sft.py][INFO] Epoch:[0/2](36400/63764) loss:3.061 lr:0.0000044 epoch_Time:101.0min: [2023-12-11 09:57:57,167][model2_sft.py][INFO] Epoch:[0/2](36450/63764) loss:3.763 lr:0.0000044 epoch_Time:101.0min: [2023-12-11 09:58:08,295][model2_sft.py][INFO] Epoch:[0/2](36500/63764) loss:3.897 lr:0.0000043 epoch_Time:101.0min: [2023-12-11 09:58:19,428][model2_sft.py][INFO] Epoch:[0/2](36550/63764) loss:3.362 lr:0.0000043 epoch_Time:101.0min: [2023-12-11 
09:58:30,539][model2_sft.py][INFO] Epoch:[0/2](36600/63764) loss:3.434 lr:0.0000043 epoch_Time:100.0min: [2023-12-11 09:58:41,686][model2_sft.py][INFO] Epoch:[0/2](36650/63764) loss:4.198 lr:0.0000043 epoch_Time:100.0min: [2023-12-11 09:58:52,788][model2_sft.py][INFO] Epoch:[0/2](36700/63764) loss:3.346 lr:0.0000042 epoch_Time:100.0min: [2023-12-11 09:59:03,938][model2_sft.py][INFO] Epoch:[0/2](36750/63764) loss:3.599 lr:0.0000042 epoch_Time:100.0min: [2023-12-11 09:59:15,074][model2_sft.py][INFO] Epoch:[0/2](36800/63764) loss:3.697 lr:0.0000042 epoch_Time:100.0min: [2023-12-11 09:59:26,205][model2_sft.py][INFO] Epoch:[0/2](36850/63764) loss:3.177 lr:0.0000042 epoch_Time:99.0min: [2023-12-11 09:59:37,357][model2_sft.py][INFO] Epoch:[0/2](36900/63764) loss:3.634 lr:0.0000042 epoch_Time:99.0min: [2023-12-11 09:59:48,495][model2_sft.py][INFO] Epoch:[0/2](36950/63764) loss:3.471 lr:0.0000041 epoch_Time:99.0min: [2023-12-11 09:59:59,605][model2_sft.py][INFO] Epoch:[0/2](37000/63764) loss:2.831 lr:0.0000041 epoch_Time:99.0min: [2023-12-11 10:00:10,787][model2_sft.py][INFO] Epoch:[0/2](37050/63764) loss:3.923 lr:0.0000041 epoch_Time:99.0min: [2023-12-11 10:00:21,926][model2_sft.py][INFO] Epoch:[0/2](37100/63764) loss:3.824 lr:0.0000041 epoch_Time:99.0min: [2023-12-11 10:00:33,085][model2_sft.py][INFO] Epoch:[0/2](37150/63764) loss:3.841 lr:0.0000040 epoch_Time:98.0min: [2023-12-11 10:00:44,257][model2_sft.py][INFO] Epoch:[0/2](37200/63764) loss:3.656 lr:0.0000040 epoch_Time:98.0min: [2023-12-11 10:00:55,382][model2_sft.py][INFO] Epoch:[0/2](37250/63764) loss:4.113 lr:0.0000040 epoch_Time:98.0min: [2023-12-11 10:01:06,516][model2_sft.py][INFO] Epoch:[0/2](37300/63764) loss:3.089 lr:0.0000040 epoch_Time:98.0min: [2023-12-11 10:01:17,675][model2_sft.py][INFO] Epoch:[0/2](37350/63764) loss:3.359 lr:0.0000040 epoch_Time:98.0min: [2023-12-11 10:01:28,837][model2_sft.py][INFO] Epoch:[0/2](37400/63764) loss:3.705 lr:0.0000039 epoch_Time:97.0min: [2023-12-11 
10:01:39,966][model2_sft.py][INFO] Epoch:[0/2](37450/63764) loss:3.298 lr:0.0000039 epoch_Time:97.0min: [2023-12-11 10:01:51,132][model2_sft.py][INFO] Epoch:[0/2](37500/63764) loss:3.465 lr:0.0000039 epoch_Time:97.0min: [2023-12-11 10:02:02,316][model2_sft.py][INFO] Epoch:[0/2](37550/63764) loss:3.333 lr:0.0000039 epoch_Time:97.0min: [2023-12-11 10:02:13,523][model2_sft.py][INFO] Epoch:[0/2](37600/63764) loss:3.341 lr:0.0000038 epoch_Time:97.0min: [2023-12-11 10:02:24,676][model2_sft.py][INFO] Epoch:[0/2](37650/63764) loss:3.248 lr:0.0000038 epoch_Time:97.0min: [2023-12-11 10:02:35,803][model2_sft.py][INFO] Epoch:[0/2](37700/63764) loss:3.266 lr:0.0000038 epoch_Time:96.0min: [2023-12-11 10:02:46,907][model2_sft.py][INFO] Epoch:[0/2](37750/63764) loss:3.905 lr:0.0000038 epoch_Time:96.0min: [2023-12-11 10:02:58,039][model2_sft.py][INFO] Epoch:[0/2](37800/63764) loss:2.942 lr:0.0000038 epoch_Time:96.0min: [2023-12-11 10:03:09,200][model2_sft.py][INFO] Epoch:[0/2](37850/63764) loss:3.308 lr:0.0000037 epoch_Time:96.0min: [2023-12-11 10:03:20,319][model2_sft.py][INFO] Epoch:[0/2](37900/63764) loss:3.714 lr:0.0000037 epoch_Time:96.0min: [2023-12-11 10:03:31,439][model2_sft.py][INFO] Epoch:[0/2](37950/63764) loss:2.809 lr:0.0000037 epoch_Time:95.0min: [2023-12-11 10:03:42,556][model2_sft.py][INFO] Epoch:[0/2](38000/63764) loss:3.244 lr:0.0000037 epoch_Time:95.0min: [2023-12-11 10:03:53,681][model2_sft.py][INFO] Epoch:[0/2](38050/63764) loss:2.965 lr:0.0000037 epoch_Time:95.0min: [2023-12-11 10:04:04,895][model2_sft.py][INFO] Epoch:[0/2](38100/63764) loss:3.214 lr:0.0000036 epoch_Time:95.0min: [2023-12-11 10:04:16,034][model2_sft.py][INFO] Epoch:[0/2](38150/63764) loss:3.230 lr:0.0000036 epoch_Time:95.0min: [2023-12-11 10:04:27,208][model2_sft.py][INFO] Epoch:[0/2](38200/63764) loss:2.955 lr:0.0000036 epoch_Time:94.0min: [2023-12-11 10:04:38,333][model2_sft.py][INFO] Epoch:[0/2](38250/63764) loss:3.160 lr:0.0000036 epoch_Time:94.0min: [2023-12-11 
10:04:49,486][model2_sft.py][INFO] Epoch:[0/2](38300/63764) loss:3.454 lr:0.0000035 epoch_Time:94.0min: [2023-12-11 10:05:00,621][model2_sft.py][INFO] Epoch:[0/2](38350/63764) loss:3.340 lr:0.0000035 epoch_Time:94.0min: [2023-12-11 10:05:11,769][model2_sft.py][INFO] Epoch:[0/2](38400/63764) loss:4.165 lr:0.0000035 epoch_Time:94.0min: [2023-12-11 10:05:22,966][model2_sft.py][INFO] Epoch:[0/2](38450/63764) loss:2.846 lr:0.0000035 epoch_Time:94.0min: [2023-12-11 10:05:34,164][model2_sft.py][INFO] Epoch:[0/2](38500/63764) loss:3.951 lr:0.0000035 epoch_Time:93.0min: [2023-12-11 10:05:45,318][model2_sft.py][INFO] Epoch:[0/2](38550/63764) loss:3.550 lr:0.0000034 epoch_Time:93.0min: [2023-12-11 10:05:56,578][model2_sft.py][INFO] Epoch:[0/2](38600/63764) loss:4.226 lr:0.0000034 epoch_Time:93.0min: [2023-12-11 10:06:07,846][model2_sft.py][INFO] Epoch:[0/2](38650/63764) loss:3.919 lr:0.0000034 epoch_Time:93.0min: [2023-12-11 10:06:19,038][model2_sft.py][INFO] Epoch:[0/2](38700/63764) loss:3.258 lr:0.0000034 epoch_Time:93.0min: [2023-12-11 10:06:30,214][model2_sft.py][INFO] Epoch:[0/2](38750/63764) loss:3.487 lr:0.0000034 epoch_Time:92.0min: [2023-12-11 10:06:41,360][model2_sft.py][INFO] Epoch:[0/2](38800/63764) loss:3.019 lr:0.0000033 epoch_Time:92.0min: [2023-12-11 10:06:52,477][model2_sft.py][INFO] Epoch:[0/2](38850/63764) loss:3.756 lr:0.0000033 epoch_Time:92.0min: [2023-12-11 10:07:03,658][model2_sft.py][INFO] Epoch:[0/2](38900/63764) loss:3.140 lr:0.0000033 epoch_Time:92.0min: [2023-12-11 10:07:14,768][model2_sft.py][INFO] Epoch:[0/2](38950/63764) loss:3.581 lr:0.0000033 epoch_Time:92.0min: [2023-12-11 10:07:25,826][model2_sft.py][INFO] Epoch:[0/2](39000/63764) loss:3.325 lr:0.0000033 epoch_Time:91.0min: [2023-12-11 10:07:36,967][model2_sft.py][INFO] Epoch:[0/2](39050/63764) loss:3.752 lr:0.0000032 epoch_Time:91.0min: [2023-12-11 10:07:48,022][model2_sft.py][INFO] Epoch:[0/2](39100/63764) loss:3.655 lr:0.0000032 epoch_Time:91.0min: [2023-12-11 
10:07:59,127][model2_sft.py][INFO] Epoch:[0/2](39150/63764) loss:3.029 lr:0.0000032 epoch_Time:91.0min: [2023-12-11 10:08:10,256][model2_sft.py][INFO] Epoch:[0/2](39200/63764) loss:3.194 lr:0.0000032 epoch_Time:91.0min: [2023-12-11 10:08:21,369][model2_sft.py][INFO] Epoch:[0/2](39250/63764) loss:3.588 lr:0.0000032 epoch_Time:91.0min: [2023-12-11 10:08:32,497][model2_sft.py][INFO] Epoch:[0/2](39300/63764) loss:3.754 lr:0.0000031 epoch_Time:90.0min: [2023-12-11 10:08:43,588][model2_sft.py][INFO] Epoch:[0/2](39350/63764) loss:3.371 lr:0.0000031 epoch_Time:90.0min: [2023-12-11 10:08:54,691][model2_sft.py][INFO] Epoch:[0/2](39400/63764) loss:3.914 lr:0.0000031 epoch_Time:90.0min: [2023-12-11 10:09:05,822][model2_sft.py][INFO] Epoch:[0/2](39450/63764) loss:3.686 lr:0.0000031 epoch_Time:90.0min: [2023-12-11 10:09:16,962][model2_sft.py][INFO] Epoch:[0/2](39500/63764) loss:3.884 lr:0.0000031 epoch_Time:90.0min: [2023-12-11 10:09:28,093][model2_sft.py][INFO] Epoch:[0/2](39550/63764) loss:3.503 lr:0.0000031 epoch_Time:89.0min: [2023-12-11 10:09:39,170][model2_sft.py][INFO] Epoch:[0/2](39600/63764) loss:3.615 lr:0.0000030 epoch_Time:89.0min: [2023-12-11 10:09:50,278][model2_sft.py][INFO] Epoch:[0/2](39650/63764) loss:3.364 lr:0.0000030 epoch_Time:89.0min: [2023-12-11 10:10:01,382][model2_sft.py][INFO] Epoch:[0/2](39700/63764) loss:2.726 lr:0.0000030 epoch_Time:89.0min: [2023-12-11 10:10:12,484][model2_sft.py][INFO] Epoch:[0/2](39750/63764) loss:3.474 lr:0.0000030 epoch_Time:89.0min: [2023-12-11 10:10:23,589][model2_sft.py][INFO] Epoch:[0/2](39800/63764) loss:3.037 lr:0.0000030 epoch_Time:89.0min: [2023-12-11 10:10:34,716][model2_sft.py][INFO] Epoch:[0/2](39850/63764) loss:3.266 lr:0.0000029 epoch_Time:88.0min: [2023-12-11 10:10:45,836][model2_sft.py][INFO] Epoch:[0/2](39900/63764) loss:3.077 lr:0.0000029 epoch_Time:88.0min: [2023-12-11 10:10:56,953][model2_sft.py][INFO] Epoch:[0/2](39950/63764) loss:3.302 lr:0.0000029 epoch_Time:88.0min: [2023-12-11 
10:11:08,113][model2_sft.py][INFO] Epoch:[0/2](40000/63764) loss:3.765 lr:0.0000029 epoch_Time:88.0min: [2023-12-11 10:11:19,248][model2_sft.py][INFO] Epoch:[0/2](40050/63764) loss:3.453 lr:0.0000029 epoch_Time:88.0min: [2023-12-11 10:11:30,368][model2_sft.py][INFO] Epoch:[0/2](40100/63764) loss:2.994 lr:0.0000029 epoch_Time:87.0min: [2023-12-11 10:11:41,497][model2_sft.py][INFO] Epoch:[0/2](40150/63764) loss:3.793 lr:0.0000028 epoch_Time:87.0min: [2023-12-11 10:11:52,672][model2_sft.py][INFO] Epoch:[0/2](40200/63764) loss:3.349 lr:0.0000028 epoch_Time:87.0min: [2023-12-11 10:12:03,788][model2_sft.py][INFO] Epoch:[0/2](40250/63764) loss:3.045 lr:0.0000028 epoch_Time:87.0min: [2023-12-11 10:12:14,993][model2_sft.py][INFO] Epoch:[0/2](40300/63764) loss:3.848 lr:0.0000028 epoch_Time:87.0min: [2023-12-11 10:12:26,199][model2_sft.py][INFO] Epoch:[0/2](40350/63764) loss:3.447 lr:0.0000028 epoch_Time:86.0min: [2023-12-11 10:12:37,396][model2_sft.py][INFO] Epoch:[0/2](40400/63764) loss:3.136 lr:0.0000027 epoch_Time:86.0min: [2023-12-11 10:12:48,606][model2_sft.py][INFO] Epoch:[0/2](40450/63764) loss:3.388 lr:0.0000027 epoch_Time:86.0min: [2023-12-11 10:12:59,789][model2_sft.py][INFO] Epoch:[0/2](40500/63764) loss:3.919 lr:0.0000027 epoch_Time:86.0min: [2023-12-11 10:13:11,022][model2_sft.py][INFO] Epoch:[0/2](40550/63764) loss:3.472 lr:0.0000027 epoch_Time:86.0min: [2023-12-11 10:13:22,276][model2_sft.py][INFO] Epoch:[0/2](40600/63764) loss:3.406 lr:0.0000027 epoch_Time:86.0min: [2023-12-11 10:13:33,726][model2_sft.py][INFO] Epoch:[0/2](40650/63764) loss:2.928 lr:0.0000027 epoch_Time:85.0min: [2023-12-11 10:13:45,250][model2_sft.py][INFO] Epoch:[0/2](40700/63764) loss:3.728 lr:0.0000026 epoch_Time:85.0min: [2023-12-11 10:13:56,539][model2_sft.py][INFO] Epoch:[0/2](40750/63764) loss:3.283 lr:0.0000026 epoch_Time:85.0min: [2023-12-11 10:14:07,882][model2_sft.py][INFO] Epoch:[0/2](40800/63764) loss:2.940 lr:0.0000026 epoch_Time:85.0min: [2023-12-11 
10:14:19,143][model2_sft.py][INFO] Epoch:[0/2](40850/63764) loss:3.798 lr:0.0000026 epoch_Time:85.0min: [2023-12-11 10:14:30,377][model2_sft.py][INFO] Epoch:[0/2](40900/63764) loss:3.922 lr:0.0000026 epoch_Time:84.0min: [2023-12-11 10:14:41,647][model2_sft.py][INFO] Epoch:[0/2](40950/63764) loss:3.740 lr:0.0000026 epoch_Time:84.0min: [2023-12-11 10:14:52,902][model2_sft.py][INFO] Epoch:[0/2](41000/63764) loss:3.642 lr:0.0000025 epoch_Time:84.0min: [2023-12-11 10:15:04,129][model2_sft.py][INFO] Epoch:[0/2](41050/63764) loss:3.376 lr:0.0000025 epoch_Time:84.0min: [2023-12-11 10:15:15,372][model2_sft.py][INFO] Epoch:[0/2](41100/63764) loss:3.772 lr:0.0000025 epoch_Time:84.0min: [2023-12-11 10:15:26,615][model2_sft.py][INFO] Epoch:[0/2](41150/63764) loss:4.053 lr:0.0000025 epoch_Time:83.0min: [2023-12-11 10:15:37,812][model2_sft.py][INFO] Epoch:[0/2](41200/63764) loss:3.593 lr:0.0000025 epoch_Time:83.0min: [2023-12-11 10:15:49,030][model2_sft.py][INFO] Epoch:[0/2](41250/63764) loss:3.571 lr:0.0000025 epoch_Time:83.0min: [2023-12-11 10:16:00,246][model2_sft.py][INFO] Epoch:[0/2](41300/63764) loss:3.109 lr:0.0000024 epoch_Time:83.0min: [2023-12-11 10:16:11,444][model2_sft.py][INFO] Epoch:[0/2](41350/63764) loss:3.621 lr:0.0000024 epoch_Time:83.0min: [2023-12-11 10:16:22,640][model2_sft.py][INFO] Epoch:[0/2](41400/63764) loss:2.747 lr:0.0000024 epoch_Time:83.0min: [2023-12-11 10:16:33,827][model2_sft.py][INFO] Epoch:[0/2](41450/63764) loss:3.358 lr:0.0000024 epoch_Time:82.0min: [2023-12-11 10:16:45,031][model2_sft.py][INFO] Epoch:[0/2](41500/63764) loss:3.800 lr:0.0000024 epoch_Time:82.0min: [2023-12-11 10:16:56,298][model2_sft.py][INFO] Epoch:[0/2](41550/63764) loss:4.054 lr:0.0000024 epoch_Time:82.0min: [2023-12-11 10:17:07,463][model2_sft.py][INFO] Epoch:[0/2](41600/63764) loss:3.093 lr:0.0000023 epoch_Time:82.0min: [2023-12-11 10:17:18,618][model2_sft.py][INFO] Epoch:[0/2](41650/63764) loss:3.833 lr:0.0000023 epoch_Time:82.0min: [2023-12-11 
10:17:29,784][model2_sft.py][INFO] Epoch:[0/2](41700/63764) loss:2.748 lr:0.0000023 epoch_Time:81.0min: [2023-12-11 10:17:40,953][model2_sft.py][INFO] Epoch:[0/2](41750/63764) loss:3.842 lr:0.0000023 epoch_Time:81.0min: [2023-12-11 10:17:52,168][model2_sft.py][INFO] Epoch:[0/2](41800/63764) loss:3.416 lr:0.0000023 epoch_Time:81.0min: [2023-12-11 10:18:03,343][model2_sft.py][INFO] Epoch:[0/2](41850/63764) loss:3.500 lr:0.0000023 epoch_Time:81.0min: [2023-12-11 10:18:14,528][model2_sft.py][INFO] Epoch:[0/2](41900/63764) loss:3.254 lr:0.0000023 epoch_Time:81.0min: [2023-12-11 10:18:25,702][model2_sft.py][INFO] Epoch:[0/2](41950/63764) loss:3.104 lr:0.0000022 epoch_Time:80.0min: [2023-12-11 10:18:36,886][model2_sft.py][INFO] Epoch:[0/2](42000/63764) loss:3.488 lr:0.0000022 epoch_Time:80.0min: [2023-12-11 10:18:48,124][model2_sft.py][INFO] Epoch:[0/2](42050/63764) loss:3.486 lr:0.0000022 epoch_Time:80.0min: [2023-12-11 10:18:59,291][model2_sft.py][INFO] Epoch:[0/2](42100/63764) loss:4.123 lr:0.0000022 epoch_Time:80.0min: [2023-12-11 10:19:10,453][model2_sft.py][INFO] Epoch:[0/2](42150/63764) loss:3.262 lr:0.0000022 epoch_Time:80.0min: [2023-12-11 10:19:21,623][model2_sft.py][INFO] Epoch:[0/2](42200/63764) loss:3.329 lr:0.0000022 epoch_Time:80.0min: [2023-12-11 10:19:32,810][model2_sft.py][INFO] Epoch:[0/2](42250/63764) loss:3.343 lr:0.0000021 epoch_Time:79.0min: [2023-12-11 10:19:43,990][model2_sft.py][INFO] Epoch:[0/2](42300/63764) loss:3.422 lr:0.0000021 epoch_Time:79.0min: [2023-12-11 10:19:55,172][model2_sft.py][INFO] Epoch:[0/2](42350/63764) loss:3.966 lr:0.0000021 epoch_Time:79.0min: [2023-12-11 10:20:06,371][model2_sft.py][INFO] Epoch:[0/2](42400/63764) loss:2.658 lr:0.0000021 epoch_Time:79.0min: [2023-12-11 10:20:17,523][model2_sft.py][INFO] Epoch:[0/2](42450/63764) loss:2.840 lr:0.0000021 epoch_Time:79.0min: [2023-12-11 10:20:28,692][model2_sft.py][INFO] Epoch:[0/2](42500/63764) loss:3.358 lr:0.0000021 epoch_Time:78.0min: [2023-12-11 
10:20:39,910][model2_sft.py][INFO] Epoch:[0/2](42550/63764) loss:3.718 lr:0.0000021 epoch_Time:78.0min: [2023-12-11 10:20:51,072][model2_sft.py][INFO] Epoch:[0/2](42600/63764) loss:3.591 lr:0.0000020 epoch_Time:78.0min: [2023-12-11 10:21:02,163][model2_sft.py][INFO] Epoch:[0/2](42650/63764) loss:3.119 lr:0.0000020 epoch_Time:78.0min: [2023-12-11 10:21:13,291][model2_sft.py][INFO] Epoch:[0/2](42700/63764) loss:3.568 lr:0.0000020 epoch_Time:78.0min: [2023-12-11 10:21:24,443][model2_sft.py][INFO] Epoch:[0/2](42750/63764) loss:3.366 lr:0.0000020 epoch_Time:78.0min: [2023-12-11 10:21:35,609][model2_sft.py][INFO] Epoch:[0/2](42800/63764) loss:3.665 lr:0.0000020 epoch_Time:77.0min: [2023-12-11 10:21:46,774][model2_sft.py][INFO] Epoch:[0/2](42850/63764) loss:3.089 lr:0.0000020 epoch_Time:77.0min: [2023-12-11 10:21:57,928][model2_sft.py][INFO] Epoch:[0/2](42900/63764) loss:4.188 lr:0.0000020 epoch_Time:77.0min: [2023-12-11 10:22:09,030][model2_sft.py][INFO] Epoch:[0/2](42950/63764) loss:3.507 lr:0.0000020 epoch_Time:77.0min: [2023-12-11 10:22:20,161][model2_sft.py][INFO] Epoch:[0/2](43000/63764) loss:2.936 lr:0.0000019 epoch_Time:77.0min: [2023-12-11 10:22:31,279][model2_sft.py][INFO] Epoch:[0/2](43050/63764) loss:3.394 lr:0.0000019 epoch_Time:76.0min: [2023-12-11 10:22:42,470][model2_sft.py][INFO] Epoch:[0/2](43100/63764) loss:2.865 lr:0.0000019 epoch_Time:76.0min: [2023-12-11 10:22:53,608][model2_sft.py][INFO] Epoch:[0/2](43150/63764) loss:3.210 lr:0.0000019 epoch_Time:76.0min: [2023-12-11 10:23:04,738][model2_sft.py][INFO] Epoch:[0/2](43200/63764) loss:3.740 lr:0.0000019 epoch_Time:76.0min: [2023-12-11 10:23:15,890][model2_sft.py][INFO] Epoch:[0/2](43250/63764) loss:3.430 lr:0.0000019 epoch_Time:76.0min: [2023-12-11 10:23:27,021][model2_sft.py][INFO] Epoch:[0/2](43300/63764) loss:2.918 lr:0.0000019 epoch_Time:75.0min: [2023-12-11 10:23:38,180][model2_sft.py][INFO] Epoch:[0/2](43350/63764) loss:3.564 lr:0.0000019 epoch_Time:75.0min: [2023-12-11 
10:23:49,307][model2_sft.py][INFO] Epoch:[0/2](43400/63764) loss:3.455 lr:0.0000018 epoch_Time:75.0min: [2023-12-11 10:24:00,462][model2_sft.py][INFO] Epoch:[0/2](43450/63764) loss:3.757 lr:0.0000018 epoch_Time:75.0min: [2023-12-11 10:24:11,627][model2_sft.py][INFO] Epoch:[0/2](43500/63764) loss:3.783 lr:0.0000018 epoch_Time:75.0min: [2023-12-11 10:24:22,780][model2_sft.py][INFO] Epoch:[0/2](43550/63764) loss:3.987 lr:0.0000018 epoch_Time:75.0min: [2023-12-11 10:24:33,930][model2_sft.py][INFO] Epoch:[0/2](43600/63764) loss:4.018 lr:0.0000018 epoch_Time:74.0min: [2023-12-11 10:24:45,058][model2_sft.py][INFO] Epoch:[0/2](43650/63764) loss:3.972 lr:0.0000018 epoch_Time:74.0min: [2023-12-11 10:24:56,303][model2_sft.py][INFO] Epoch:[0/2](43700/63764) loss:3.338 lr:0.0000018 epoch_Time:74.0min: [2023-12-11 10:25:07,484][model2_sft.py][INFO] Epoch:[0/2](43750/63764) loss:2.973 lr:0.0000018 epoch_Time:74.0min: [2023-12-11 10:25:18,646][model2_sft.py][INFO] Epoch:[0/2](43800/63764) loss:3.480 lr:0.0000017 epoch_Time:74.0min: [2023-12-11 10:25:29,797][model2_sft.py][INFO] Epoch:[0/2](43850/63764) loss:3.395 lr:0.0000017 epoch_Time:73.0min: [2023-12-11 10:25:40,900][model2_sft.py][INFO] Epoch:[0/2](43900/63764) loss:3.303 lr:0.0000017 epoch_Time:73.0min: [2023-12-11 10:25:52,040][model2_sft.py][INFO] Epoch:[0/2](43950/63764) loss:3.848 lr:0.0000017 epoch_Time:73.0min: [2023-12-11 10:26:03,222][model2_sft.py][INFO] Epoch:[0/2](44000/63764) loss:3.385 lr:0.0000017 epoch_Time:73.0min: [2023-12-11 10:26:14,381][model2_sft.py][INFO] Epoch:[0/2](44050/63764) loss:3.808 lr:0.0000017 epoch_Time:73.0min: [2023-12-11 10:26:25,511][model2_sft.py][INFO] Epoch:[0/2](44100/63764) loss:3.822 lr:0.0000017 epoch_Time:72.0min: [2023-12-11 10:26:36,658][model2_sft.py][INFO] Epoch:[0/2](44150/63764) loss:3.312 lr:0.0000017 epoch_Time:72.0min: [2023-12-11 10:26:47,796][model2_sft.py][INFO] Epoch:[0/2](44200/63764) loss:2.676 lr:0.0000016 epoch_Time:72.0min: [2023-12-11 
10:26:58,974][model2_sft.py][INFO] Epoch:[0/2](44250/63764) loss:2.926 lr:0.0000016 epoch_Time:72.0min: [2023-12-11 10:27:10,087][model2_sft.py][INFO] Epoch:[0/2](44300/63764) loss:3.738 lr:0.0000016 epoch_Time:72.0min: [2023-12-11 10:27:21,249][model2_sft.py][INFO] Epoch:[0/2](44350/63764) loss:3.677 lr:0.0000016 epoch_Time:72.0min: [2023-12-11 10:27:32,395][model2_sft.py][INFO] Epoch:[0/2](44400/63764) loss:3.470 lr:0.0000016 epoch_Time:71.0min: [2023-12-11 10:27:43,538][model2_sft.py][INFO] Epoch:[0/2](44450/63764) loss:3.560 lr:0.0000016 epoch_Time:71.0min: [2023-12-11 10:27:54,653][model2_sft.py][INFO] Epoch:[0/2](44500/63764) loss:3.825 lr:0.0000016 epoch_Time:71.0min: [2023-12-11 10:28:05,796][model2_sft.py][INFO] Epoch:[0/2](44550/63764) loss:3.049 lr:0.0000016 epoch_Time:71.0min: [2023-12-11 10:28:16,908][model2_sft.py][INFO] Epoch:[0/2](44600/63764) loss:4.130 lr:0.0000016 epoch_Time:71.0min: [2023-12-11 10:28:28,035][model2_sft.py][INFO] Epoch:[0/2](44650/63764) loss:2.973 lr:0.0000016 epoch_Time:70.0min: [2023-12-11 10:28:39,190][model2_sft.py][INFO] Epoch:[0/2](44700/63764) loss:3.065 lr:0.0000015 epoch_Time:70.0min: [2023-12-11 10:28:50,309][model2_sft.py][INFO] Epoch:[0/2](44750/63764) loss:3.691 lr:0.0000015 epoch_Time:70.0min: [2023-12-11 10:29:01,422][model2_sft.py][INFO] Epoch:[0/2](44800/63764) loss:3.030 lr:0.0000015 epoch_Time:70.0min: [2023-12-11 10:29:12,542][model2_sft.py][INFO] Epoch:[0/2](44850/63764) loss:3.389 lr:0.0000015 epoch_Time:70.0min: [2023-12-11 10:29:23,714][model2_sft.py][INFO] Epoch:[0/2](44900/63764) loss:4.243 lr:0.0000015 epoch_Time:70.0min: [2023-12-11 10:29:34,822][model2_sft.py][INFO] Epoch:[0/2](44950/63764) loss:3.950 lr:0.0000015 epoch_Time:69.0min: [2023-12-11 10:29:45,967][model2_sft.py][INFO] Epoch:[0/2](45000/63764) loss:3.744 lr:0.0000015 epoch_Time:69.0min: [2023-12-11 10:29:57,082][model2_sft.py][INFO] Epoch:[0/2](45050/63764) loss:2.873 lr:0.0000015 epoch_Time:69.0min: [2023-12-11 
10:30:08,259][model2_sft.py][INFO] Epoch:[0/2](45100/63764) loss:3.788 lr:0.0000015 epoch_Time:69.0min: [2023-12-11 10:30:19,398][model2_sft.py][INFO] Epoch:[0/2](45150/63764) loss:3.331 lr:0.0000015 epoch_Time:69.0min: [2023-12-11 10:30:30,593][model2_sft.py][INFO] Epoch:[0/2](45200/63764) loss:3.456 lr:0.0000014 epoch_Time:68.0min: [2023-12-11 10:30:41,761][model2_sft.py][INFO] Epoch:[0/2](45250/63764) loss:3.453 lr:0.0000014 epoch_Time:68.0min: [2023-12-11 10:30:52,897][model2_sft.py][INFO] Epoch:[0/2](45300/63764) loss:3.725 lr:0.0000014 epoch_Time:68.0min: [2023-12-11 10:31:04,036][model2_sft.py][INFO] Epoch:[0/2](45350/63764) loss:3.514 lr:0.0000014 epoch_Time:68.0min: [2023-12-11 10:31:15,174][model2_sft.py][INFO] Epoch:[0/2](45400/63764) loss:2.734 lr:0.0000014 epoch_Time:68.0min: [2023-12-11 10:31:26,347][model2_sft.py][INFO] Epoch:[0/2](45450/63764) loss:3.601 lr:0.0000014 epoch_Time:67.0min: [2023-12-11 10:31:37,511][model2_sft.py][INFO] Epoch:[0/2](45500/63764) loss:3.377 lr:0.0000014 epoch_Time:67.0min: [2023-12-11 10:31:48,666][model2_sft.py][INFO] Epoch:[0/2](45550/63764) loss:3.543 lr:0.0000014 epoch_Time:67.0min: [2023-12-11 10:31:59,805][model2_sft.py][INFO] Epoch:[0/2](45600/63764) loss:3.394 lr:0.0000014 epoch_Time:67.0min: [2023-12-11 10:32:11,043][model2_sft.py][INFO] Epoch:[0/2](45650/63764) loss:3.154 lr:0.0000014 epoch_Time:67.0min: [2023-12-11 10:32:22,196][model2_sft.py][INFO] Epoch:[0/2](45700/63764) loss:3.178 lr:0.0000014 epoch_Time:67.0min: [2023-12-11 10:32:33,347][model2_sft.py][INFO] Epoch:[0/2](45750/63764) loss:3.749 lr:0.0000014 epoch_Time:66.0min: [2023-12-11 10:32:44,519][model2_sft.py][INFO] Epoch:[0/2](45800/63764) loss:3.391 lr:0.0000013 epoch_Time:66.0min: [2023-12-11 10:32:55,659][model2_sft.py][INFO] Epoch:[0/2](45850/63764) loss:3.208 lr:0.0000013 epoch_Time:66.0min: [2023-12-11 10:33:06,792][model2_sft.py][INFO] Epoch:[0/2](45900/63764) loss:3.349 lr:0.0000013 epoch_Time:66.0min: [2023-12-11 
10:33:17,981][model2_sft.py][INFO] Epoch:[0/2](45950/63764) loss:3.487 lr:0.0000013 epoch_Time:66.0min: [2023-12-11 10:33:29,123][model2_sft.py][INFO] Epoch:[0/2](46000/63764) loss:3.101 lr:0.0000013 epoch_Time:65.0min: [2023-12-11 10:33:40,312][model2_sft.py][INFO] Epoch:[0/2](46050/63764) loss:3.319 lr:0.0000013 epoch_Time:65.0min: [2023-12-11 10:33:51,527][model2_sft.py][INFO] Epoch:[0/2](46100/63764) loss:3.623 lr:0.0000013 epoch_Time:65.0min: [2023-12-11 10:34:02,666][model2_sft.py][INFO] Epoch:[0/2](46150/63764) loss:3.225 lr:0.0000013 epoch_Time:65.0min: [2023-12-11 10:34:13,818][model2_sft.py][INFO] Epoch:[0/2](46200/63764) loss:3.191 lr:0.0000013 epoch_Time:65.0min: [2023-12-11 10:34:24,999][model2_sft.py][INFO] Epoch:[0/2](46250/63764) loss:3.978 lr:0.0000013 epoch_Time:65.0min: [2023-12-11 10:34:36,147][model2_sft.py][INFO] Epoch:[0/2](46300/63764) loss:3.208 lr:0.0000013 epoch_Time:64.0min: [2023-12-11 10:34:47,335][model2_sft.py][INFO] Epoch:[0/2](46350/63764) loss:3.272 lr:0.0000013 epoch_Time:64.0min: [2023-12-11 10:34:58,497][model2_sft.py][INFO] Epoch:[0/2](46400/63764) loss:3.685 lr:0.0000013 epoch_Time:64.0min: [2023-12-11 10:35:09,634][model2_sft.py][INFO] Epoch:[0/2](46450/63764) loss:2.908 lr:0.0000012 epoch_Time:64.0min: [2023-12-11 10:35:20,737][model2_sft.py][INFO] Epoch:[0/2](46500/63764) loss:2.919 lr:0.0000012 epoch_Time:64.0min: [2023-12-11 10:35:31,881][model2_sft.py][INFO] Epoch:[0/2](46550/63764) loss:2.613 lr:0.0000012 epoch_Time:63.0min: [2023-12-11 10:35:43,025][model2_sft.py][INFO] Epoch:[0/2](46600/63764) loss:3.299 lr:0.0000012 epoch_Time:63.0min: [2023-12-11 10:35:54,199][model2_sft.py][INFO] Epoch:[0/2](46650/63764) loss:3.384 lr:0.0000012 epoch_Time:63.0min: [2023-12-11 10:36:05,381][model2_sft.py][INFO] Epoch:[0/2](46700/63764) loss:3.395 lr:0.0000012 epoch_Time:63.0min: [2023-12-11 10:36:16,513][model2_sft.py][INFO] Epoch:[0/2](46750/63764) loss:3.577 lr:0.0000012 epoch_Time:63.0min: [2023-12-11 
10:36:27,663][model2_sft.py][INFO] Epoch:[0/2](46800/63764) loss:2.863 lr:0.0000012 epoch_Time:62.0min: [2023-12-11 10:36:38,780][model2_sft.py][INFO] Epoch:[0/2](46850/63764) loss:3.126 lr:0.0000012 epoch_Time:62.0min: [2023-12-11 10:36:49,944][model2_sft.py][INFO] Epoch:[0/2](46900/63764) loss:3.022 lr:0.0000012 epoch_Time:62.0min: [2023-12-11 10:37:01,136][model2_sft.py][INFO] Epoch:[0/2](46950/63764) loss:3.277 lr:0.0000012 epoch_Time:62.0min: [2023-12-11 10:37:12,302][model2_sft.py][INFO] Epoch:[0/2](47000/63764) loss:3.369 lr:0.0000012 epoch_Time:62.0min: [2023-12-11 10:37:23,487][model2_sft.py][INFO] Epoch:[0/2](47050/63764) loss:3.143 lr:0.0000012 epoch_Time:62.0min: [2023-12-11 10:37:34,718][model2_sft.py][INFO] Epoch:[0/2](47100/63764) loss:3.824 lr:0.0000012 epoch_Time:61.0min: [2023-12-11 10:37:45,892][model2_sft.py][INFO] Epoch:[0/2](47150/63764) loss:3.529 lr:0.0000012 epoch_Time:61.0min: [2023-12-11 10:37:57,058][model2_sft.py][INFO] Epoch:[0/2](47200/63764) loss:3.544 lr:0.0000012 epoch_Time:61.0min: [2023-12-11 10:38:08,234][model2_sft.py][INFO] Epoch:[0/2](47250/63764) loss:4.198 lr:0.0000011 epoch_Time:61.0min: [2023-12-11 10:38:19,405][model2_sft.py][INFO] Epoch:[0/2](47300/63764) loss:3.041 lr:0.0000011 epoch_Time:61.0min: [2023-12-11 10:38:30,560][model2_sft.py][INFO] Epoch:[0/2](47350/63764) loss:2.893 lr:0.0000011 epoch_Time:60.0min: [2023-12-11 10:38:41,763][model2_sft.py][INFO] Epoch:[0/2](47400/63764) loss:3.342 lr:0.0000011 epoch_Time:60.0min: [2023-12-11 10:38:52,947][model2_sft.py][INFO] Epoch:[0/2](47450/63764) loss:2.834 lr:0.0000011 epoch_Time:60.0min: [2023-12-11 10:39:04,133][model2_sft.py][INFO] Epoch:[0/2](47500/63764) loss:3.016 lr:0.0000011 epoch_Time:60.0min: [2023-12-11 10:39:15,308][model2_sft.py][INFO] Epoch:[0/2](47550/63764) loss:3.093 lr:0.0000011 epoch_Time:60.0min: [2023-12-11 10:39:26,453][model2_sft.py][INFO] Epoch:[0/2](47600/63764) loss:3.907 lr:0.0000011 epoch_Time:59.0min: [2023-12-11 
10:39:37,598][model2_sft.py][INFO] Epoch:[0/2](47650/63764) loss:3.429 lr:0.0000011 epoch_Time:59.0min: [2023-12-11 10:39:48,772][model2_sft.py][INFO] Epoch:[0/2](47700/63764) loss:2.960 lr:0.0000011 epoch_Time:59.0min: [2023-12-11 10:39:59,943][model2_sft.py][INFO] Epoch:[0/2](47750/63764) loss:2.914 lr:0.0000011 epoch_Time:59.0min: [2023-12-11 10:40:11,115][model2_sft.py][INFO] Epoch:[0/2](47800/63764) loss:4.162 lr:0.0000011 epoch_Time:59.0min: [2023-12-11 10:40:22,299][model2_sft.py][INFO] Epoch:[0/2](47850/63764) loss:3.992 lr:0.0000011 epoch_Time:59.0min: [2023-12-11 10:40:33,500][model2_sft.py][INFO] Epoch:[0/2](47900/63764) loss:2.902 lr:0.0000011 epoch_Time:58.0min: [2023-12-11 10:40:44,677][model2_sft.py][INFO] Epoch:[0/2](47950/63764) loss:3.072 lr:0.0000011 epoch_Time:58.0min: [2023-12-11 10:40:55,855][model2_sft.py][INFO] Epoch:[0/2](48000/63764) loss:3.129 lr:0.0000011 epoch_Time:58.0min: [2023-12-11 10:41:07,065][model2_sft.py][INFO] Epoch:[0/2](48050/63764) loss:3.492 lr:0.0000011 epoch_Time:58.0min: [2023-12-11 10:41:18,226][model2_sft.py][INFO] Epoch:[0/2](48100/63764) loss:3.782 lr:0.0000011 epoch_Time:58.0min: [2023-12-11 10:41:29,396][model2_sft.py][INFO] Epoch:[0/2](48150/63764) loss:3.744 lr:0.0000011 epoch_Time:57.0min: [2023-12-11 10:41:40,608][model2_sft.py][INFO] Epoch:[0/2](48200/63764) loss:3.458 lr:0.0000011 epoch_Time:57.0min: [2023-12-11 10:41:51,801][model2_sft.py][INFO] Epoch:[0/2](48250/63764) loss:2.766 lr:0.0000011 epoch_Time:57.0min: [2023-12-11 10:42:03,040][model2_sft.py][INFO] Epoch:[0/2](48300/63764) loss:3.330 lr:0.0000011 epoch_Time:57.0min: [2023-12-11 10:42:14,258][model2_sft.py][INFO] Epoch:[0/2](48350/63764) loss:3.312 lr:0.0000011 epoch_Time:57.0min: [2023-12-11 10:42:25,476][model2_sft.py][INFO] Epoch:[0/2](48400/63764) loss:3.437 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 10:42:36,721][model2_sft.py][INFO] Epoch:[0/2](48450/63764) loss:3.005 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 
10:42:47,919][model2_sft.py][INFO] Epoch:[0/2](48500/63764) loss:3.471 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 10:42:59,078][model2_sft.py][INFO] Epoch:[0/2](48550/63764) loss:3.615 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 10:43:10,289][model2_sft.py][INFO] Epoch:[0/2](48600/63764) loss:3.114 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 10:43:21,480][model2_sft.py][INFO] Epoch:[0/2](48650/63764) loss:3.707 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 10:43:32,660][model2_sft.py][INFO] Epoch:[0/2](48700/63764) loss:3.620 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 10:43:43,831][model2_sft.py][INFO] Epoch:[0/2](48750/63764) loss:3.707 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 10:43:54,993][model2_sft.py][INFO] Epoch:[0/2](48800/63764) loss:3.955 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 10:44:06,167][model2_sft.py][INFO] Epoch:[0/2](48850/63764) loss:4.135 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 10:44:17,440][model2_sft.py][INFO] Epoch:[0/2](48900/63764) loss:3.111 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 10:44:28,589][model2_sft.py][INFO] Epoch:[0/2](48950/63764) loss:3.887 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 10:44:39,761][model2_sft.py][INFO] Epoch:[0/2](49000/63764) loss:3.515 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 10:44:50,888][model2_sft.py][INFO] Epoch:[0/2](49050/63764) loss:4.030 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 10:45:02,093][model2_sft.py][INFO] Epoch:[0/2](49100/63764) loss:4.019 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 10:45:13,306][model2_sft.py][INFO] Epoch:[0/2](49150/63764) loss:3.604 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 10:45:24,482][model2_sft.py][INFO] Epoch:[0/2](49200/63764) loss:3.570 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 10:45:35,663][model2_sft.py][INFO] Epoch:[0/2](49250/63764) loss:3.121 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 10:45:46,858][model2_sft.py][INFO] Epoch:[0/2](49300/63764) loss:2.945 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 
10:45:58,008][model2_sft.py][INFO] Epoch:[0/2](49350/63764) loss:3.394 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 10:46:09,190][model2_sft.py][INFO] Epoch:[0/2](49400/63764) loss:3.553 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 10:46:20,350][model2_sft.py][INFO] Epoch:[0/2](49450/63764) loss:3.786 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 10:46:31,511][model2_sft.py][INFO] Epoch:[0/2](49500/63764) loss:3.330 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 10:46:42,656][model2_sft.py][INFO] Epoch:[0/2](49550/63764) loss:3.143 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 10:46:53,812][model2_sft.py][INFO] Epoch:[0/2](49600/63764) loss:3.199 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 10:47:05,018][model2_sft.py][INFO] Epoch:[0/2](49650/63764) loss:2.572 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 10:47:16,189][model2_sft.py][INFO] Epoch:[0/2](49700/63764) loss:3.461 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 10:47:27,373][model2_sft.py][INFO] Epoch:[0/2](49750/63764) loss:3.371 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 10:47:38,542][model2_sft.py][INFO] Epoch:[0/2](49800/63764) loss:3.619 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 10:47:49,734][model2_sft.py][INFO] Epoch:[0/2](49850/63764) loss:3.956 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 10:48:00,942][model2_sft.py][INFO] Epoch:[0/2](49900/63764) loss:2.853 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 10:48:12,122][model2_sft.py][INFO] Epoch:[0/2](49950/63764) loss:3.359 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 10:48:23,274][model2_sft.py][INFO] Epoch:[0/2](50000/63764) loss:3.471 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 10:48:34,411][model2_sft.py][INFO] Epoch:[0/2](50050/63764) loss:3.070 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 10:48:45,615][model2_sft.py][INFO] Epoch:[0/2](50100/63764) loss:3.868 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 10:48:56,874][model2_sft.py][INFO] Epoch:[0/2](50150/63764) loss:3.591 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 
10:49:08,063][model2_sft.py][INFO] Epoch:[0/2](50200/63764) loss:3.626 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 10:49:19,287][model2_sft.py][INFO] Epoch:[0/2](50250/63764) loss:3.264 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 10:49:30,499][model2_sft.py][INFO] Epoch:[0/2](50300/63764) loss:3.781 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 10:49:41,721][model2_sft.py][INFO] Epoch:[0/2](50350/63764) loss:3.394 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 10:49:52,910][model2_sft.py][INFO] Epoch:[0/2](50400/63764) loss:4.053 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 10:50:04,087][model2_sft.py][INFO] Epoch:[0/2](50450/63764) loss:3.099 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 10:50:15,290][model2_sft.py][INFO] Epoch:[0/2](50500/63764) loss:3.884 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 10:50:26,481][model2_sft.py][INFO] Epoch:[0/2](50550/63764) loss:3.781 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 10:50:37,672][model2_sft.py][INFO] Epoch:[0/2](50600/63764) loss:3.310 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 10:50:48,874][model2_sft.py][INFO] Epoch:[0/2](50650/63764) loss:3.702 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 10:51:00,038][model2_sft.py][INFO] Epoch:[0/2](50700/63764) loss:3.429 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 10:51:11,233][model2_sft.py][INFO] Epoch:[0/2](50750/63764) loss:4.060 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 10:51:22,445][model2_sft.py][INFO] Epoch:[0/2](50800/63764) loss:3.769 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 10:51:33,629][model2_sft.py][INFO] Epoch:[0/2](50850/63764) loss:3.127 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 10:51:44,797][model2_sft.py][INFO] Epoch:[0/2](50900/63764) loss:3.561 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 10:51:55,993][model2_sft.py][INFO] Epoch:[0/2](50950/63764) loss:3.169 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 10:52:07,214][model2_sft.py][INFO] Epoch:[0/2](51000/63764) loss:4.316 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 
10:52:18,378][model2_sft.py][INFO] Epoch:[0/2](51050/63764) loss:3.238 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 10:52:29,566][model2_sft.py][INFO] Epoch:[0/2](51100/63764) loss:3.278 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 10:52:40,683][model2_sft.py][INFO] Epoch:[0/2](51150/63764) loss:2.767 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 10:52:51,861][model2_sft.py][INFO] Epoch:[0/2](51200/63764) loss:3.291 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 10:53:03,003][model2_sft.py][INFO] Epoch:[0/2](51250/63764) loss:3.200 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 10:53:14,213][model2_sft.py][INFO] Epoch:[0/2](51300/63764) loss:3.455 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 10:53:25,425][model2_sft.py][INFO] Epoch:[0/2](51350/63764) loss:3.413 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 10:53:36,569][model2_sft.py][INFO] Epoch:[0/2](51400/63764) loss:3.607 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 10:53:47,833][model2_sft.py][INFO] Epoch:[0/2](51450/63764) loss:2.963 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 10:53:58,969][model2_sft.py][INFO] Epoch:[0/2](51500/63764) loss:3.690 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 10:54:10,152][model2_sft.py][INFO] Epoch:[0/2](51550/63764) loss:2.753 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 10:54:21,353][model2_sft.py][INFO] Epoch:[0/2](51600/63764) loss:3.223 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 10:54:32,520][model2_sft.py][INFO] Epoch:[0/2](51650/63764) loss:3.147 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 10:54:43,732][model2_sft.py][INFO] Epoch:[0/2](51700/63764) loss:3.415 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 10:54:54,878][model2_sft.py][INFO] Epoch:[0/2](51750/63764) loss:4.090 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 10:55:06,068][model2_sft.py][INFO] Epoch:[0/2](51800/63764) loss:3.305 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 10:55:17,248][model2_sft.py][INFO] Epoch:[0/2](51850/63764) loss:3.654 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 
10:55:28,416][model2_sft.py][INFO] Epoch:[0/2](51900/63764) loss:3.456 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 10:55:39,584][model2_sft.py][INFO] Epoch:[0/2](51950/63764) loss:3.446 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 10:55:50,774][model2_sft.py][INFO] Epoch:[0/2](52000/63764) loss:3.339 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 10:56:01,938][model2_sft.py][INFO] Epoch:[0/2](52050/63764) loss:2.361 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 10:56:13,088][model2_sft.py][INFO] Epoch:[0/2](52100/63764) loss:2.994 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 10:56:24,243][model2_sft.py][INFO] Epoch:[0/2](52150/63764) loss:3.265 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 10:56:35,430][model2_sft.py][INFO] Epoch:[0/2](52200/63764) loss:3.195 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 10:56:46,639][model2_sft.py][INFO] Epoch:[0/2](52250/63764) loss:3.659 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 10:56:57,832][model2_sft.py][INFO] Epoch:[0/2](52300/63764) loss:3.087 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 10:57:09,026][model2_sft.py][INFO] Epoch:[0/2](52350/63764) loss:3.134 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 10:57:20,195][model2_sft.py][INFO] Epoch:[0/2](52400/63764) loss:3.101 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 10:57:31,399][model2_sft.py][INFO] Epoch:[0/2](52450/63764) loss:3.529 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 10:57:42,576][model2_sft.py][INFO] Epoch:[0/2](52500/63764) loss:2.567 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 10:57:53,820][model2_sft.py][INFO] Epoch:[0/2](52550/63764) loss:3.745 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 10:58:05,017][model2_sft.py][INFO] Epoch:[0/2](52600/63764) loss:2.943 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 10:58:16,212][model2_sft.py][INFO] Epoch:[0/2](52650/63764) loss:3.313 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 10:58:27,442][model2_sft.py][INFO] Epoch:[0/2](52700/63764) loss:3.427 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 
10:58:38,625][model2_sft.py][INFO] Epoch:[0/2](52750/63764) loss:3.196 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 10:58:49,832][model2_sft.py][INFO] Epoch:[0/2](52800/63764) loss:3.398 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 10:59:01,027][model2_sft.py][INFO] Epoch:[0/2](52850/63764) loss:3.402 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 10:59:12,234][model2_sft.py][INFO] Epoch:[0/2](52900/63764) loss:3.614 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 10:59:23,436][model2_sft.py][INFO] Epoch:[0/2](52950/63764) loss:4.718 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 10:59:34,625][model2_sft.py][INFO] Epoch:[0/2](53000/63764) loss:3.856 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 10:59:45,810][model2_sft.py][INFO] Epoch:[0/2](53050/63764) loss:3.926 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 10:59:57,015][model2_sft.py][INFO] Epoch:[0/2](53100/63764) loss:3.540 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 11:00:08,194][model2_sft.py][INFO] Epoch:[0/2](53150/63764) loss:3.576 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 11:00:19,418][model2_sft.py][INFO] Epoch:[0/2](53200/63764) loss:3.184 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 11:00:30,621][model2_sft.py][INFO] Epoch:[0/2](53250/63764) loss:2.856 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 11:00:41,790][model2_sft.py][INFO] Epoch:[0/2](53300/63764) loss:3.227 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 11:00:52,949][model2_sft.py][INFO] Epoch:[0/2](53350/63764) loss:3.181 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 11:01:04,148][model2_sft.py][INFO] Epoch:[0/2](53400/63764) loss:4.172 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 11:01:15,317][model2_sft.py][INFO] Epoch:[0/2](53450/63764) loss:2.895 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 11:01:26,468][model2_sft.py][INFO] Epoch:[0/2](53500/63764) loss:3.652 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 11:01:37,686][model2_sft.py][INFO] Epoch:[0/2](53550/63764) loss:3.645 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 
11:01:48,842][model2_sft.py][INFO] Epoch:[0/2](53600/63764) loss:3.850 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 11:01:59,988][model2_sft.py][INFO] Epoch:[0/2](53650/63764) loss:4.013 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 11:02:11,110][model2_sft.py][INFO] Epoch:[0/2](53700/63764) loss:3.417 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 11:02:22,260][model2_sft.py][INFO] Epoch:[0/2](53750/63764) loss:4.123 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 11:02:33,431][model2_sft.py][INFO] Epoch:[0/2](53800/63764) loss:3.691 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 11:02:44,598][model2_sft.py][INFO] Epoch:[0/2](53850/63764) loss:3.619 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 11:02:55,736][model2_sft.py][INFO] Epoch:[0/2](53900/63764) loss:3.467 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 11:03:06,910][model2_sft.py][INFO] Epoch:[0/2](53950/63764) loss:4.005 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 11:03:18,038][model2_sft.py][INFO] Epoch:[0/2](54000/63764) loss:3.341 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 11:03:29,230][model2_sft.py][INFO] Epoch:[0/2](54050/63764) loss:3.715 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 11:03:40,391][model2_sft.py][INFO] Epoch:[0/2](54100/63764) loss:3.224 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 11:03:51,584][model2_sft.py][INFO] Epoch:[0/2](54150/63764) loss:3.552 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 11:04:02,769][model2_sft.py][INFO] Epoch:[0/2](54200/63764) loss:2.977 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 11:04:13,962][model2_sft.py][INFO] Epoch:[0/2](54250/63764) loss:3.595 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 11:04:25,124][model2_sft.py][INFO] Epoch:[0/2](54300/63764) loss:3.776 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 11:04:36,282][model2_sft.py][INFO] Epoch:[0/2](54350/63764) loss:2.887 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 11:04:47,452][model2_sft.py][INFO] Epoch:[0/2](54400/63764) loss:3.264 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 
11:04:58,616][model2_sft.py][INFO] Epoch:[0/2](54450/63764) loss:3.838 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 11:05:09,762][model2_sft.py][INFO] Epoch:[0/2](54500/63764) loss:3.057 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 11:05:20,964][model2_sft.py][INFO] Epoch:[0/2](54550/63764) loss:2.952 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 11:05:32,131][model2_sft.py][INFO] Epoch:[0/2](54600/63764) loss:3.370 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 11:05:43,325][model2_sft.py][INFO] Epoch:[0/2](54650/63764) loss:3.171 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 11:05:54,494][model2_sft.py][INFO] Epoch:[0/2](54700/63764) loss:3.607 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 11:06:05,661][model2_sft.py][INFO] Epoch:[0/2](54750/63764) loss:3.134 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 11:06:16,824][model2_sft.py][INFO] Epoch:[0/2](54800/63764) loss:2.959 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 11:06:27,973][model2_sft.py][INFO] Epoch:[0/2](54850/63764) loss:3.662 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 11:06:39,173][model2_sft.py][INFO] Epoch:[0/2](54900/63764) loss:3.926 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 11:06:50,314][model2_sft.py][INFO] Epoch:[0/2](54950/63764) loss:3.571 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 11:07:01,484][model2_sft.py][INFO] Epoch:[0/2](55000/63764) loss:4.046 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 11:07:12,745][model2_sft.py][INFO] Epoch:[0/2](55050/63764) loss:3.368 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 11:07:23,913][model2_sft.py][INFO] Epoch:[0/2](55100/63764) loss:3.273 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 11:07:35,044][model2_sft.py][INFO] Epoch:[0/2](55150/63764) loss:3.473 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 11:07:46,228][model2_sft.py][INFO] Epoch:[0/2](55200/63764) loss:3.592 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 11:07:57,348][model2_sft.py][INFO] Epoch:[0/2](55250/63764) loss:3.999 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 
11:08:08,518][model2_sft.py][INFO] Epoch:[0/2](55300/63764) loss:3.378 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 11:08:19,657][model2_sft.py][INFO] Epoch:[0/2](55350/63764) loss:3.639 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 11:08:30,820][model2_sft.py][INFO] Epoch:[0/2](55400/63764) loss:3.748 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 11:08:41,985][model2_sft.py][INFO] Epoch:[0/2](55450/63764) loss:4.012 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 11:08:53,144][model2_sft.py][INFO] Epoch:[0/2](55500/63764) loss:3.962 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 11:09:04,303][model2_sft.py][INFO] Epoch:[0/2](55550/63764) loss:2.712 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 11:09:15,485][model2_sft.py][INFO] Epoch:[0/2](55600/63764) loss:3.155 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 11:09:26,630][model2_sft.py][INFO] Epoch:[0/2](55650/63764) loss:3.769 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 11:09:37,799][model2_sft.py][INFO] Epoch:[0/2](55700/63764) loss:3.699 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 11:09:48,966][model2_sft.py][INFO] Epoch:[0/2](55750/63764) loss:3.263 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 11:10:00,119][model2_sft.py][INFO] Epoch:[0/2](55800/63764) loss:3.256 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 11:10:11,298][model2_sft.py][INFO] Epoch:[0/2](55850/63764) loss:2.594 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 11:10:22,481][model2_sft.py][INFO] Epoch:[0/2](55900/63764) loss:3.847 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 11:10:33,584][model2_sft.py][INFO] Epoch:[0/2](55950/63764) loss:3.314 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 11:10:44,742][model2_sft.py][INFO] Epoch:[0/2](56000/63764) loss:3.696 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 11:10:55,888][model2_sft.py][INFO] Epoch:[0/2](56050/63764) loss:3.360 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 11:11:07,033][model2_sft.py][INFO] Epoch:[0/2](56100/63764) loss:3.867 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 
11:11:18,224][model2_sft.py][INFO] Epoch:[0/2](56150/63764) loss:3.387 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 11:11:29,396][model2_sft.py][INFO] Epoch:[0/2](56200/63764) loss:3.337 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 11:11:40,525][model2_sft.py][INFO] Epoch:[0/2](56250/63764) loss:3.399 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 11:11:51,639][model2_sft.py][INFO] Epoch:[0/2](56300/63764) loss:2.998 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 11:12:02,794][model2_sft.py][INFO] Epoch:[0/2](56350/63764) loss:4.417 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 11:12:13,981][model2_sft.py][INFO] Epoch:[0/2](56400/63764) loss:3.600 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 11:12:25,106][model2_sft.py][INFO] Epoch:[0/2](56450/63764) loss:3.526 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 11:12:36,275][model2_sft.py][INFO] Epoch:[0/2](56500/63764) loss:3.705 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 11:12:47,479][model2_sft.py][INFO] Epoch:[0/2](56550/63764) loss:3.666 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 11:12:58,670][model2_sft.py][INFO] Epoch:[0/2](56600/63764) loss:3.425 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 11:13:09,806][model2_sft.py][INFO] Epoch:[0/2](56650/63764) loss:3.948 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 11:13:20,999][model2_sft.py][INFO] Epoch:[0/2](56700/63764) loss:3.076 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 11:13:32,145][model2_sft.py][INFO] Epoch:[0/2](56750/63764) loss:3.590 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 11:13:43,272][model2_sft.py][INFO] Epoch:[0/2](56800/63764) loss:3.227 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 11:13:54,424][model2_sft.py][INFO] Epoch:[0/2](56850/63764) loss:3.740 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 11:14:05,568][model2_sft.py][INFO] Epoch:[0/2](56900/63764) loss:3.121 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 11:14:16,774][model2_sft.py][INFO] Epoch:[0/2](56950/63764) loss:3.829 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 
11:14:27,975][model2_sft.py][INFO] Epoch:[0/2](57000/63764) loss:3.437 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 11:14:39,130][model2_sft.py][INFO] Epoch:[0/2](57050/63764) loss:3.833 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 11:14:50,283][model2_sft.py][INFO] Epoch:[0/2](57100/63764) loss:2.922 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 11:15:01,430][model2_sft.py][INFO] Epoch:[0/2](57150/63764) loss:3.706 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 11:15:12,606][model2_sft.py][INFO] Epoch:[0/2](57200/63764) loss:3.219 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 11:15:23,757][model2_sft.py][INFO] Epoch:[0/2](57250/63764) loss:3.942 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 11:15:34,921][model2_sft.py][INFO] Epoch:[0/2](57300/63764) loss:2.876 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 11:15:46,058][model2_sft.py][INFO] Epoch:[0/2](57350/63764) loss:3.744 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 11:15:57,231][model2_sft.py][INFO] Epoch:[0/2](57400/63764) loss:3.218 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 11:16:08,408][model2_sft.py][INFO] Epoch:[0/2](57450/63764) loss:2.890 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 11:16:19,538][model2_sft.py][INFO] Epoch:[0/2](57500/63764) loss:3.450 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 11:16:30,722][model2_sft.py][INFO] Epoch:[0/2](57550/63764) loss:3.381 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 11:16:41,911][model2_sft.py][INFO] Epoch:[0/2](57600/63764) loss:3.598 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 11:16:53,085][model2_sft.py][INFO] Epoch:[0/2](57650/63764) loss:3.497 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 11:17:04,296][model2_sft.py][INFO] Epoch:[0/2](57700/63764) loss:2.808 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 11:17:15,509][model2_sft.py][INFO] Epoch:[0/2](57750/63764) loss:3.320 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 11:17:26,667][model2_sft.py][INFO] Epoch:[0/2](57800/63764) loss:3.983 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 
11:17:37,834][model2_sft.py][INFO] Epoch:[0/2](57850/63764) loss:3.300 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 11:17:49,035][model2_sft.py][INFO] Epoch:[0/2](57900/63764) loss:3.109 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 11:18:00,171][model2_sft.py][INFO] Epoch:[0/2](57950/63764) loss:3.028 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 11:18:11,313][model2_sft.py][INFO] Epoch:[0/2](58000/63764) loss:3.077 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 11:18:22,451][model2_sft.py][INFO] Epoch:[0/2](58050/63764) loss:3.834 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 11:18:33,640][model2_sft.py][INFO] Epoch:[0/2](58100/63764) loss:3.511 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 11:18:44,757][model2_sft.py][INFO] Epoch:[0/2](58150/63764) loss:3.666 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 11:18:55,981][model2_sft.py][INFO] Epoch:[0/2](58200/63764) loss:3.716 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 11:19:07,101][model2_sft.py][INFO] Epoch:[0/2](58250/63764) loss:3.375 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 11:19:18,302][model2_sft.py][INFO] Epoch:[0/2](58300/63764) loss:2.922 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 11:19:29,442][model2_sft.py][INFO] Epoch:[0/2](58350/63764) loss:3.830 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 11:19:40,591][model2_sft.py][INFO] Epoch:[0/2](58400/63764) loss:3.032 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 11:19:51,737][model2_sft.py][INFO] Epoch:[0/2](58450/63764) loss:2.989 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 11:20:02,941][model2_sft.py][INFO] Epoch:[0/2](58500/63764) loss:3.273 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 11:20:14,090][model2_sft.py][INFO] Epoch:[0/2](58550/63764) loss:3.988 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 11:20:25,282][model2_sft.py][INFO] Epoch:[0/2](58600/63764) loss:3.363 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 11:20:36,466][model2_sft.py][INFO] Epoch:[0/2](58650/63764) loss:3.474 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 
11:20:47,617][model2_sft.py][INFO] Epoch:[0/2](58700/63764) loss:3.167 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 11:20:58,818][model2_sft.py][INFO] Epoch:[0/2](58750/63764) loss:3.662 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 11:21:10,005][model2_sft.py][INFO] Epoch:[0/2](58800/63764) loss:3.545 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 11:21:21,159][model2_sft.py][INFO] Epoch:[0/2](58850/63764) loss:3.335 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 11:21:32,329][model2_sft.py][INFO] Epoch:[0/2](58900/63764) loss:3.461 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 11:21:43,490][model2_sft.py][INFO] Epoch:[0/2](58950/63764) loss:3.332 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 11:21:54,664][model2_sft.py][INFO] Epoch:[0/2](59000/63764) loss:3.266 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 11:22:05,861][model2_sft.py][INFO] Epoch:[0/2](59050/63764) loss:3.023 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 11:22:17,030][model2_sft.py][INFO] Epoch:[0/2](59100/63764) loss:3.656 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 11:22:28,225][model2_sft.py][INFO] Epoch:[0/2](59150/63764) loss:3.375 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 11:22:39,344][model2_sft.py][INFO] Epoch:[0/2](59200/63764) loss:3.715 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 11:22:50,488][model2_sft.py][INFO] Epoch:[0/2](59250/63764) loss:2.951 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 11:23:01,641][model2_sft.py][INFO] Epoch:[0/2](59300/63764) loss:2.779 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 11:23:12,813][model2_sft.py][INFO] Epoch:[0/2](59350/63764) loss:3.540 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 11:23:24,009][model2_sft.py][INFO] Epoch:[0/2](59400/63764) loss:2.860 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 11:23:35,153][model2_sft.py][INFO] Epoch:[0/2](59450/63764) loss:2.889 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 11:23:46,281][model2_sft.py][INFO] Epoch:[0/2](59500/63764) loss:3.388 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 
11:23:57,425][model2_sft.py][INFO] Epoch:[0/2](59550/63764) loss:3.506 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 11:24:08,638][model2_sft.py][INFO] Epoch:[0/2](59600/63764) loss:3.689 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 11:24:19,785][model2_sft.py][INFO] Epoch:[0/2](59650/63764) loss:3.090 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 11:24:30,943][model2_sft.py][INFO] Epoch:[0/2](59700/63764) loss:4.020 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 11:24:42,094][model2_sft.py][INFO] Epoch:[0/2](59750/63764) loss:3.151 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 11:24:53,223][model2_sft.py][INFO] Epoch:[0/2](59800/63764) loss:2.493 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 11:25:04,410][model2_sft.py][INFO] Epoch:[0/2](59850/63764) loss:3.750 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 11:25:15,558][model2_sft.py][INFO] Epoch:[0/2](59900/63764) loss:3.050 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 11:25:26,699][model2_sft.py][INFO] Epoch:[0/2](59950/63764) loss:3.700 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 11:25:37,867][model2_sft.py][INFO] Epoch:[0/2](60000/63764) loss:3.231 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 11:25:49,027][model2_sft.py][INFO] Epoch:[0/2](60050/63764) loss:3.755 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 11:26:00,148][model2_sft.py][INFO] Epoch:[0/2](60100/63764) loss:3.330 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 11:26:11,295][model2_sft.py][INFO] Epoch:[0/2](60150/63764) loss:3.816 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 11:26:22,397][model2_sft.py][INFO] Epoch:[0/2](60200/63764) loss:3.498 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 11:26:33,588][model2_sft.py][INFO] Epoch:[0/2](60250/63764) loss:3.477 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 11:26:44,732][model2_sft.py][INFO] Epoch:[0/2](60300/63764) loss:3.316 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 11:26:55,861][model2_sft.py][INFO] Epoch:[0/2](60350/63764) loss:3.616 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 
11:27:07,029][model2_sft.py][INFO] Epoch:[0/2](60400/63764) loss:3.245 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 11:27:18,166][model2_sft.py][INFO] Epoch:[0/2](60450/63764) loss:3.561 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 11:27:29,312][model2_sft.py][INFO] Epoch:[0/2](60500/63764) loss:3.838 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 11:27:40,435][model2_sft.py][INFO] Epoch:[0/2](60550/63764) loss:2.912 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 11:27:51,629][model2_sft.py][INFO] Epoch:[0/2](60600/63764) loss:3.860 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 11:28:02,797][model2_sft.py][INFO] Epoch:[0/2](60650/63764) loss:3.165 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 11:28:13,998][model2_sft.py][INFO] Epoch:[0/2](60700/63764) loss:3.283 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 11:28:25,166][model2_sft.py][INFO] Epoch:[0/2](60750/63764) loss:3.221 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 11:28:36,315][model2_sft.py][INFO] Epoch:[0/2](60800/63764) loss:3.188 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 11:28:47,502][model2_sft.py][INFO] Epoch:[0/2](60850/63764) loss:3.811 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 11:28:58,648][model2_sft.py][INFO] Epoch:[0/2](60900/63764) loss:3.366 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 11:29:09,826][model2_sft.py][INFO] Epoch:[0/2](60950/63764) loss:2.971 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 11:29:20,994][model2_sft.py][INFO] Epoch:[0/2](61000/63764) loss:3.228 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 11:29:32,122][model2_sft.py][INFO] Epoch:[0/2](61050/63764) loss:3.448 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 11:29:43,238][model2_sft.py][INFO] Epoch:[0/2](61100/63764) loss:4.097 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 11:29:54,334][model2_sft.py][INFO] Epoch:[0/2](61150/63764) loss:3.202 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 11:30:05,500][model2_sft.py][INFO] Epoch:[0/2](61200/63764) loss:3.961 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 
11:30:16,673][model2_sft.py][INFO] Epoch:[0/2](61250/63764) loss:3.011 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 11:30:27,791][model2_sft.py][INFO] Epoch:[0/2](61300/63764) loss:3.655 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 11:30:38,968][model2_sft.py][INFO] Epoch:[0/2](61350/63764) loss:4.158 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 11:30:50,116][model2_sft.py][INFO] Epoch:[0/2](61400/63764) loss:4.361 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 11:31:01,256][model2_sft.py][INFO] Epoch:[0/2](61450/63764) loss:3.655 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 11:31:12,386][model2_sft.py][INFO] Epoch:[0/2](61500/63764) loss:3.294 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 11:31:23,560][model2_sft.py][INFO] Epoch:[0/2](61550/63764) loss:3.308 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 11:31:34,682][model2_sft.py][INFO] Epoch:[0/2](61600/63764) loss:3.604 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 11:31:45,870][model2_sft.py][INFO] Epoch:[0/2](61650/63764) loss:2.876 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 11:31:57,038][model2_sft.py][INFO] Epoch:[0/2](61700/63764) loss:2.769 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 11:32:08,200][model2_sft.py][INFO] Epoch:[0/2](61750/63764) loss:3.483 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 11:32:19,318][model2_sft.py][INFO] Epoch:[0/2](61800/63764) loss:3.383 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 11:32:30,467][model2_sft.py][INFO] Epoch:[0/2](61850/63764) loss:3.295 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 11:32:41,667][model2_sft.py][INFO] Epoch:[0/2](61900/63764) loss:3.289 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 11:32:52,838][model2_sft.py][INFO] Epoch:[0/2](61950/63764) loss:3.598 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 11:33:03,984][model2_sft.py][INFO] Epoch:[0/2](62000/63764) loss:2.901 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 11:33:15,126][model2_sft.py][INFO] Epoch:[0/2](62050/63764) loss:3.779 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 11:33:26,283][model2_sft.py][INFO] 
Epoch:[0/2](62100/63764) loss:3.208 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 11:33:37,420][model2_sft.py][INFO] Epoch:[0/2](62150/63764) loss:3.012 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 11:33:48,586][model2_sft.py][INFO] Epoch:[0/2](62200/63764) loss:3.292 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 11:33:59,822][model2_sft.py][INFO] Epoch:[0/2](62250/63764) loss:3.587 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 11:34:10,963][model2_sft.py][INFO] Epoch:[0/2](62300/63764) loss:3.161 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 11:34:22,129][model2_sft.py][INFO] Epoch:[0/2](62350/63764) loss:2.439 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 11:34:33,295][model2_sft.py][INFO] Epoch:[0/2](62400/63764) loss:3.530 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 11:34:44,482][model2_sft.py][INFO] Epoch:[0/2](62450/63764) loss:3.665 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 11:34:55,685][model2_sft.py][INFO] Epoch:[0/2](62500/63764) loss:2.945 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 11:35:06,801][model2_sft.py][INFO] Epoch:[0/2](62550/63764) loss:3.092 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 11:35:17,938][model2_sft.py][INFO] Epoch:[0/2](62600/63764) loss:3.916 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 11:35:29,113][model2_sft.py][INFO] Epoch:[0/2](62650/63764) loss:2.939 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 11:35:40,238][model2_sft.py][INFO] Epoch:[0/2](62700/63764) loss:3.599 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 11:35:51,366][model2_sft.py][INFO] Epoch:[0/2](62750/63764) loss:3.679 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 11:36:02,476][model2_sft.py][INFO] Epoch:[0/2](62800/63764) loss:3.882 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 11:36:13,633][model2_sft.py][INFO] Epoch:[0/2](62850/63764) loss:4.024 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 11:36:24,744][model2_sft.py][INFO] Epoch:[0/2](62900/63764) loss:3.752 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 11:36:35,887][model2_sft.py][INFO] Epoch:[0/2](62950/63764) loss:3.426 
lr:0.0000010 epoch_Time:3.0min: [2023-12-11 11:36:47,013][model2_sft.py][INFO] Epoch:[0/2](63000/63764) loss:4.098 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 11:36:58,180][model2_sft.py][INFO] Epoch:[0/2](63050/63764) loss:2.622 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 11:37:09,329][model2_sft.py][INFO] Epoch:[0/2](63100/63764) loss:3.304 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 11:37:20,473][model2_sft.py][INFO] Epoch:[0/2](63150/63764) loss:3.201 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 11:37:31,630][model2_sft.py][INFO] Epoch:[0/2](63200/63764) loss:2.838 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 11:37:42,790][model2_sft.py][INFO] Epoch:[0/2](63250/63764) loss:3.425 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 11:37:53,945][model2_sft.py][INFO] Epoch:[0/2](63300/63764) loss:3.334 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 11:38:05,140][model2_sft.py][INFO] Epoch:[0/2](63350/63764) loss:3.488 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 11:38:16,273][model2_sft.py][INFO] Epoch:[0/2](63400/63764) loss:3.584 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 11:38:27,389][model2_sft.py][INFO] Epoch:[0/2](63450/63764) loss:3.064 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 11:38:38,563][model2_sft.py][INFO] Epoch:[0/2](63500/63764) loss:3.080 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 11:38:49,668][model2_sft.py][INFO] Epoch:[0/2](63550/63764) loss:3.089 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 11:39:00,788][model2_sft.py][INFO] Epoch:[0/2](63600/63764) loss:3.229 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 11:39:11,959][model2_sft.py][INFO] Epoch:[0/2](63650/63764) loss:3.053 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 11:39:23,125][model2_sft.py][INFO] Epoch:[0/2](63700/63764) loss:3.732 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 11:39:34,315][model2_sft.py][INFO] Epoch:[0/2](63750/63764) loss:3.521 lr:0.0000010 epoch_Time:0.0min: [2023-12-11 11:39:38,642][model2_sft.py][INFO] Epoch:[1/2](0/63764) loss:3.111 lr:0.0000010 epoch_Time:239.0min: [2023-12-11 
11:39:49,789][model2_sft.py][INFO] Epoch:[1/2](50/63764) loss:3.514 lr:0.0000010 epoch_Time:236.0min: [2023-12-11 11:40:00,973][model2_sft.py][INFO] Epoch:[1/2](100/63764) loss:3.688 lr:0.0000010 epoch_Time:237.0min: [2023-12-11 11:40:12,127][model2_sft.py][INFO] Epoch:[1/2](150/63764) loss:3.699 lr:0.0000010 epoch_Time:237.0min: [2023-12-11 11:40:23,249][model2_sft.py][INFO] Epoch:[1/2](200/63764) loss:2.677 lr:0.0000010 epoch_Time:237.0min: [2023-12-11 11:40:34,438][model2_sft.py][INFO] Epoch:[1/2](250/63764) loss:3.339 lr:0.0000010 epoch_Time:237.0min: [2023-12-11 11:40:45,582][model2_sft.py][INFO] Epoch:[1/2](300/63764) loss:3.748 lr:0.0000010 epoch_Time:236.0min: [2023-12-11 11:40:56,741][model2_sft.py][INFO] Epoch:[1/2](350/63764) loss:3.313 lr:0.0000010 epoch_Time:236.0min: [2023-12-11 11:41:07,882][model2_sft.py][INFO] Epoch:[1/2](400/63764) loss:2.625 lr:0.0000010 epoch_Time:236.0min: [2023-12-11 11:41:19,030][model2_sft.py][INFO] Epoch:[1/2](450/63764) loss:2.880 lr:0.0000010 epoch_Time:236.0min: [2023-12-11 11:41:30,141][model2_sft.py][INFO] Epoch:[1/2](500/63764) loss:3.413 lr:0.0000010 epoch_Time:235.0min: [2023-12-11 11:41:41,325][model2_sft.py][INFO] Epoch:[1/2](550/63764) loss:3.627 lr:0.0000010 epoch_Time:235.0min: [2023-12-11 11:41:52,455][model2_sft.py][INFO] Epoch:[1/2](600/63764) loss:3.286 lr:0.0000010 epoch_Time:235.0min: [2023-12-11 11:42:03,572][model2_sft.py][INFO] Epoch:[1/2](650/63764) loss:3.453 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 11:42:14,674][model2_sft.py][INFO] Epoch:[1/2](700/63764) loss:3.412 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 11:42:25,876][model2_sft.py][INFO] Epoch:[1/2](750/63764) loss:2.979 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 11:42:37,021][model2_sft.py][INFO] Epoch:[1/2](800/63764) loss:3.122 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 11:42:48,189][model2_sft.py][INFO] Epoch:[1/2](850/63764) loss:3.085 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 11:42:59,313][model2_sft.py][INFO] 
Epoch:[1/2](900/63764) loss:3.678 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 11:43:10,482][model2_sft.py][INFO] Epoch:[1/2](950/63764) loss:3.786 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 11:43:21,652][model2_sft.py][INFO] Epoch:[1/2](1000/63764) loss:3.372 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 11:43:32,836][model2_sft.py][INFO] Epoch:[1/2](1050/63764) loss:2.755 lr:0.0000010 epoch_Time:234.0min: [2023-12-11 11:43:44,114][model2_sft.py][INFO] Epoch:[1/2](1100/63764) loss:3.722 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 11:43:55,309][model2_sft.py][INFO] Epoch:[1/2](1150/63764) loss:3.151 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 11:44:06,431][model2_sft.py][INFO] Epoch:[1/2](1200/63764) loss:3.042 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 11:44:17,572][model2_sft.py][INFO] Epoch:[1/2](1250/63764) loss:3.378 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 11:44:28,701][model2_sft.py][INFO] Epoch:[1/2](1300/63764) loss:2.993 lr:0.0000010 epoch_Time:233.0min: [2023-12-11 11:44:39,864][model2_sft.py][INFO] Epoch:[1/2](1350/63764) loss:3.447 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 11:44:51,093][model2_sft.py][INFO] Epoch:[1/2](1400/63764) loss:3.715 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 11:45:02,241][model2_sft.py][INFO] Epoch:[1/2](1450/63764) loss:3.393 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 11:45:13,396][model2_sft.py][INFO] Epoch:[1/2](1500/63764) loss:3.549 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 11:45:24,504][model2_sft.py][INFO] Epoch:[1/2](1550/63764) loss:3.237 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 11:45:35,670][model2_sft.py][INFO] Epoch:[1/2](1600/63764) loss:2.848 lr:0.0000010 epoch_Time:232.0min: [2023-12-11 11:45:46,832][model2_sft.py][INFO] Epoch:[1/2](1650/63764) loss:2.931 lr:0.0000010 epoch_Time:231.0min: [2023-12-11 11:45:57,973][model2_sft.py][INFO] Epoch:[1/2](1700/63764) loss:2.681 lr:0.0000010 epoch_Time:231.0min: [2023-12-11 11:46:09,133][model2_sft.py][INFO] Epoch:[1/2](1750/63764) 
loss:3.526 lr:0.0000010 epoch_Time:231.0min: [2023-12-11 11:46:20,283][model2_sft.py][INFO] Epoch:[1/2](1800/63764) loss:3.349 lr:0.0000010 epoch_Time:231.0min: [2023-12-11 11:46:31,454][model2_sft.py][INFO] Epoch:[1/2](1850/63764) loss:3.544 lr:0.0000010 epoch_Time:231.0min: [2023-12-11 11:46:42,627][model2_sft.py][INFO] Epoch:[1/2](1900/63764) loss:3.922 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 11:46:53,788][model2_sft.py][INFO] Epoch:[1/2](1950/63764) loss:2.898 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 11:47:04,913][model2_sft.py][INFO] Epoch:[1/2](2000/63764) loss:2.735 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 11:47:16,065][model2_sft.py][INFO] Epoch:[1/2](2050/63764) loss:3.343 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 11:47:27,200][model2_sft.py][INFO] Epoch:[1/2](2100/63764) loss:3.792 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 11:47:38,349][model2_sft.py][INFO] Epoch:[1/2](2150/63764) loss:3.317 lr:0.0000010 epoch_Time:230.0min: [2023-12-11 11:47:49,499][model2_sft.py][INFO] Epoch:[1/2](2200/63764) loss:3.858 lr:0.0000010 epoch_Time:229.0min: [2023-12-11 11:48:00,637][model2_sft.py][INFO] Epoch:[1/2](2250/63764) loss:2.780 lr:0.0000010 epoch_Time:229.0min: [2023-12-11 11:48:11,760][model2_sft.py][INFO] Epoch:[1/2](2300/63764) loss:3.228 lr:0.0000010 epoch_Time:229.0min: [2023-12-11 11:48:22,909][model2_sft.py][INFO] Epoch:[1/2](2350/63764) loss:4.121 lr:0.0000010 epoch_Time:229.0min: [2023-12-11 11:48:34,065][model2_sft.py][INFO] Epoch:[1/2](2400/63764) loss:3.224 lr:0.0000010 epoch_Time:229.0min: [2023-12-11 11:48:45,186][model2_sft.py][INFO] Epoch:[1/2](2450/63764) loss:3.369 lr:0.0000010 epoch_Time:228.0min: [2023-12-11 11:48:56,306][model2_sft.py][INFO] Epoch:[1/2](2500/63764) loss:3.541 lr:0.0000010 epoch_Time:228.0min: [2023-12-11 11:49:07,447][model2_sft.py][INFO] Epoch:[1/2](2550/63764) loss:2.446 lr:0.0000010 epoch_Time:228.0min: [2023-12-11 11:49:18,582][model2_sft.py][INFO] Epoch:[1/2](2600/63764) loss:3.538 lr:0.0000010 
epoch_Time:228.0min: [2023-12-11 11:49:29,714][model2_sft.py][INFO] Epoch:[1/2](2650/63764) loss:3.275 lr:0.0000010 epoch_Time:228.0min: [2023-12-11 11:49:40,859][model2_sft.py][INFO] Epoch:[1/2](2700/63764) loss:3.574 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 11:49:51,966][model2_sft.py][INFO] Epoch:[1/2](2750/63764) loss:3.603 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 11:50:03,099][model2_sft.py][INFO] Epoch:[1/2](2800/63764) loss:2.848 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 11:50:14,276][model2_sft.py][INFO] Epoch:[1/2](2850/63764) loss:2.668 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 11:50:25,446][model2_sft.py][INFO] Epoch:[1/2](2900/63764) loss:3.655 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 11:50:36,603][model2_sft.py][INFO] Epoch:[1/2](2950/63764) loss:3.682 lr:0.0000010 epoch_Time:227.0min: [2023-12-11 11:50:47,783][model2_sft.py][INFO] Epoch:[1/2](3000/63764) loss:3.274 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 11:50:58,922][model2_sft.py][INFO] Epoch:[1/2](3050/63764) loss:3.541 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 11:51:10,026][model2_sft.py][INFO] Epoch:[1/2](3100/63764) loss:3.093 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 11:51:21,183][model2_sft.py][INFO] Epoch:[1/2](3150/63764) loss:3.349 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 11:51:32,302][model2_sft.py][INFO] Epoch:[1/2](3200/63764) loss:2.962 lr:0.0000010 epoch_Time:226.0min: [2023-12-11 11:51:43,448][model2_sft.py][INFO] Epoch:[1/2](3250/63764) loss:3.708 lr:0.0000010 epoch_Time:225.0min: [2023-12-11 11:51:54,587][model2_sft.py][INFO] Epoch:[1/2](3300/63764) loss:2.884 lr:0.0000010 epoch_Time:225.0min: [2023-12-11 11:52:05,745][model2_sft.py][INFO] Epoch:[1/2](3350/63764) loss:4.060 lr:0.0000010 epoch_Time:225.0min: [2023-12-11 11:52:16,964][model2_sft.py][INFO] Epoch:[1/2](3400/63764) loss:3.167 lr:0.0000010 epoch_Time:225.0min: [2023-12-11 11:52:28,114][model2_sft.py][INFO] Epoch:[1/2](3450/63764) loss:3.437 lr:0.0000010 epoch_Time:225.0min: 
[2023-12-11 11:52:39,292][model2_sft.py][INFO] Epoch:[1/2](3500/63764) loss:3.311 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 11:52:50,420][model2_sft.py][INFO] Epoch:[1/2](3550/63764) loss:3.109 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 11:53:01,544][model2_sft.py][INFO] Epoch:[1/2](3600/63764) loss:3.171 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 11:53:12,743][model2_sft.py][INFO] Epoch:[1/2](3650/63764) loss:3.108 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 11:53:23,921][model2_sft.py][INFO] Epoch:[1/2](3700/63764) loss:2.901 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 11:53:35,058][model2_sft.py][INFO] Epoch:[1/2](3750/63764) loss:3.928 lr:0.0000010 epoch_Time:224.0min: [2023-12-11 11:53:46,243][model2_sft.py][INFO] Epoch:[1/2](3800/63764) loss:3.295 lr:0.0000010 epoch_Time:223.0min: [2023-12-11 11:53:57,447][model2_sft.py][INFO] Epoch:[1/2](3850/63764) loss:3.408 lr:0.0000010 epoch_Time:223.0min: [2023-12-11 11:54:08,594][model2_sft.py][INFO] Epoch:[1/2](3900/63764) loss:3.424 lr:0.0000010 epoch_Time:223.0min: [2023-12-11 11:54:19,721][model2_sft.py][INFO] Epoch:[1/2](3950/63764) loss:2.955 lr:0.0000010 epoch_Time:223.0min: [2023-12-11 11:54:30,829][model2_sft.py][INFO] Epoch:[1/2](4000/63764) loss:3.251 lr:0.0000010 epoch_Time:223.0min: [2023-12-11 11:54:42,002][model2_sft.py][INFO] Epoch:[1/2](4050/63764) loss:3.862 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 11:54:53,165][model2_sft.py][INFO] Epoch:[1/2](4100/63764) loss:3.462 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 11:55:04,349][model2_sft.py][INFO] Epoch:[1/2](4150/63764) loss:3.764 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 11:55:15,512][model2_sft.py][INFO] Epoch:[1/2](4200/63764) loss:3.609 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 11:55:26,669][model2_sft.py][INFO] Epoch:[1/2](4250/63764) loss:3.715 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 11:55:37,808][model2_sft.py][INFO] Epoch:[1/2](4300/63764) loss:3.449 lr:0.0000010 epoch_Time:222.0min: [2023-12-11 
11:55:48,999][model2_sft.py][INFO] Epoch:[1/2](4350/63764) loss:4.179 lr:0.0000010 epoch_Time:221.0min: [2023-12-11 11:56:00,145][model2_sft.py][INFO] Epoch:[1/2](4400/63764) loss:3.369 lr:0.0000010 epoch_Time:221.0min: [2023-12-11 11:56:11,271][model2_sft.py][INFO] Epoch:[1/2](4450/63764) loss:3.283 lr:0.0000010 epoch_Time:221.0min: [2023-12-11 11:56:22,440][model2_sft.py][INFO] Epoch:[1/2](4500/63764) loss:2.798 lr:0.0000010 epoch_Time:221.0min: [2023-12-11 11:56:33,551][model2_sft.py][INFO] Epoch:[1/2](4550/63764) loss:3.647 lr:0.0000010 epoch_Time:221.0min: [2023-12-11 11:56:44,739][model2_sft.py][INFO] Epoch:[1/2](4600/63764) loss:2.881 lr:0.0000010 epoch_Time:220.0min: [2023-12-11 11:56:55,928][model2_sft.py][INFO] Epoch:[1/2](4650/63764) loss:3.225 lr:0.0000010 epoch_Time:220.0min: [2023-12-11 11:57:07,090][model2_sft.py][INFO] Epoch:[1/2](4700/63764) loss:3.234 lr:0.0000010 epoch_Time:220.0min: [2023-12-11 11:57:18,281][model2_sft.py][INFO] Epoch:[1/2](4750/63764) loss:2.988 lr:0.0000010 epoch_Time:220.0min: [2023-12-11 11:57:29,425][model2_sft.py][INFO] Epoch:[1/2](4800/63764) loss:2.817 lr:0.0000010 epoch_Time:220.0min: [2023-12-11 11:57:40,666][model2_sft.py][INFO] Epoch:[1/2](4850/63764) loss:3.091 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 11:57:51,813][model2_sft.py][INFO] Epoch:[1/2](4900/63764) loss:3.536 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 11:58:02,971][model2_sft.py][INFO] Epoch:[1/2](4950/63764) loss:3.774 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 11:58:14,149][model2_sft.py][INFO] Epoch:[1/2](5000/63764) loss:3.811 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 11:58:25,381][model2_sft.py][INFO] Epoch:[1/2](5050/63764) loss:3.194 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 11:58:36,501][model2_sft.py][INFO] Epoch:[1/2](5100/63764) loss:3.758 lr:0.0000010 epoch_Time:219.0min: [2023-12-11 11:58:47,675][model2_sft.py][INFO] Epoch:[1/2](5150/63764) loss:3.357 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 
11:58:58,801][model2_sft.py][INFO] Epoch:[1/2](5200/63764) loss:3.277 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 11:59:09,962][model2_sft.py][INFO] Epoch:[1/2](5250/63764) loss:2.994 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 11:59:21,149][model2_sft.py][INFO] Epoch:[1/2](5300/63764) loss:2.614 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 11:59:32,338][model2_sft.py][INFO] Epoch:[1/2](5350/63764) loss:3.467 lr:0.0000010 epoch_Time:218.0min: [2023-12-11 11:59:43,464][model2_sft.py][INFO] Epoch:[1/2](5400/63764) loss:3.451 lr:0.0000010 epoch_Time:217.0min: [2023-12-11 11:59:54,599][model2_sft.py][INFO] Epoch:[1/2](5450/63764) loss:3.432 lr:0.0000010 epoch_Time:217.0min: [2023-12-11 12:00:05,742][model2_sft.py][INFO] Epoch:[1/2](5500/63764) loss:3.586 lr:0.0000010 epoch_Time:217.0min: [2023-12-11 12:00:16,906][model2_sft.py][INFO] Epoch:[1/2](5550/63764) loss:2.739 lr:0.0000010 epoch_Time:217.0min: [2023-12-11 12:00:28,095][model2_sft.py][INFO] Epoch:[1/2](5600/63764) loss:3.163 lr:0.0000010 epoch_Time:217.0min: [2023-12-11 12:00:39,282][model2_sft.py][INFO] Epoch:[1/2](5650/63764) loss:3.628 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 12:00:50,431][model2_sft.py][INFO] Epoch:[1/2](5700/63764) loss:2.848 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 12:01:01,551][model2_sft.py][INFO] Epoch:[1/2](5750/63764) loss:3.595 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 12:01:12,719][model2_sft.py][INFO] Epoch:[1/2](5800/63764) loss:3.440 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 12:01:23,883][model2_sft.py][INFO] Epoch:[1/2](5850/63764) loss:3.546 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 12:01:35,000][model2_sft.py][INFO] Epoch:[1/2](5900/63764) loss:2.306 lr:0.0000010 epoch_Time:216.0min: [2023-12-11 12:01:46,176][model2_sft.py][INFO] Epoch:[1/2](5950/63764) loss:2.680 lr:0.0000010 epoch_Time:215.0min: [2023-12-11 12:01:57,374][model2_sft.py][INFO] Epoch:[1/2](6000/63764) loss:3.010 lr:0.0000010 epoch_Time:215.0min: [2023-12-11 
12:02:08,549][model2_sft.py][INFO] Epoch:[1/2](6050/63764) loss:3.142 lr:0.0000010 epoch_Time:215.0min: [2023-12-11 12:02:19,703][model2_sft.py][INFO] Epoch:[1/2](6100/63764) loss:3.063 lr:0.0000010 epoch_Time:215.0min: [2023-12-11 12:02:30,882][model2_sft.py][INFO] Epoch:[1/2](6150/63764) loss:3.136 lr:0.0000010 epoch_Time:215.0min: [2023-12-11 12:02:42,058][model2_sft.py][INFO] Epoch:[1/2](6200/63764) loss:2.756 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 12:02:53,202][model2_sft.py][INFO] Epoch:[1/2](6250/63764) loss:3.456 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 12:03:04,375][model2_sft.py][INFO] Epoch:[1/2](6300/63764) loss:3.177 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 12:03:15,522][model2_sft.py][INFO] Epoch:[1/2](6350/63764) loss:3.509 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 12:03:26,715][model2_sft.py][INFO] Epoch:[1/2](6400/63764) loss:3.212 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 12:03:37,882][model2_sft.py][INFO] Epoch:[1/2](6450/63764) loss:3.146 lr:0.0000010 epoch_Time:214.0min: [2023-12-11 12:03:49,048][model2_sft.py][INFO] Epoch:[1/2](6500/63764) loss:2.664 lr:0.0000010 epoch_Time:213.0min: [2023-12-11 12:04:00,273][model2_sft.py][INFO] Epoch:[1/2](6550/63764) loss:3.382 lr:0.0000010 epoch_Time:213.0min: [2023-12-11 12:04:11,422][model2_sft.py][INFO] Epoch:[1/2](6600/63764) loss:3.731 lr:0.0000010 epoch_Time:213.0min: [2023-12-11 12:04:22,558][model2_sft.py][INFO] Epoch:[1/2](6650/63764) loss:2.826 lr:0.0000010 epoch_Time:213.0min: [2023-12-11 12:04:33,717][model2_sft.py][INFO] Epoch:[1/2](6700/63764) loss:3.019 lr:0.0000010 epoch_Time:213.0min: [2023-12-11 12:04:44,859][model2_sft.py][INFO] Epoch:[1/2](6750/63764) loss:3.353 lr:0.0000010 epoch_Time:212.0min: [2023-12-11 12:04:56,086][model2_sft.py][INFO] Epoch:[1/2](6800/63764) loss:3.830 lr:0.0000010 epoch_Time:212.0min: [2023-12-11 12:05:07,214][model2_sft.py][INFO] Epoch:[1/2](6850/63764) loss:3.608 lr:0.0000010 epoch_Time:212.0min: [2023-12-11 
12:05:18,357][model2_sft.py][INFO] Epoch:[1/2](6900/63764) loss:3.455 lr:0.0000010 epoch_Time:212.0min: [2023-12-11 12:05:29,509][model2_sft.py][INFO] Epoch:[1/2](6950/63764) loss:3.314 lr:0.0000010 epoch_Time:212.0min: [2023-12-11 12:05:40,660][model2_sft.py][INFO] Epoch:[1/2](7000/63764) loss:2.623 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 12:05:51,827][model2_sft.py][INFO] Epoch:[1/2](7050/63764) loss:3.497 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 12:06:02,963][model2_sft.py][INFO] Epoch:[1/2](7100/63764) loss:3.462 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 12:06:14,162][model2_sft.py][INFO] Epoch:[1/2](7150/63764) loss:3.019 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 12:06:25,363][model2_sft.py][INFO] Epoch:[1/2](7200/63764) loss:3.187 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 12:06:36,504][model2_sft.py][INFO] Epoch:[1/2](7250/63764) loss:2.767 lr:0.0000010 epoch_Time:211.0min: [2023-12-11 12:06:47,637][model2_sft.py][INFO] Epoch:[1/2](7300/63764) loss:2.284 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 12:06:58,772][model2_sft.py][INFO] Epoch:[1/2](7350/63764) loss:3.536 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 12:07:09,922][model2_sft.py][INFO] Epoch:[1/2](7400/63764) loss:2.718 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 12:07:21,039][model2_sft.py][INFO] Epoch:[1/2](7450/63764) loss:3.528 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 12:07:32,229][model2_sft.py][INFO] Epoch:[1/2](7500/63764) loss:3.007 lr:0.0000010 epoch_Time:210.0min: [2023-12-11 12:07:43,377][model2_sft.py][INFO] Epoch:[1/2](7550/63764) loss:2.889 lr:0.0000010 epoch_Time:209.0min: [2023-12-11 12:07:54,522][model2_sft.py][INFO] Epoch:[1/2](7600/63764) loss:3.058 lr:0.0000010 epoch_Time:209.0min: [2023-12-11 12:08:05,635][model2_sft.py][INFO] Epoch:[1/2](7650/63764) loss:2.880 lr:0.0000010 epoch_Time:209.0min: [2023-12-11 12:08:16,829][model2_sft.py][INFO] Epoch:[1/2](7700/63764) loss:2.736 lr:0.0000010 epoch_Time:209.0min: [2023-12-11 
12:08:27,960][model2_sft.py][INFO] Epoch:[1/2](7750/63764) loss:3.044 lr:0.0000010 epoch_Time:209.0min: [2023-12-11 12:08:39,113][model2_sft.py][INFO] Epoch:[1/2](7800/63764) loss:3.830 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 12:08:50,274][model2_sft.py][INFO] Epoch:[1/2](7850/63764) loss:2.539 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 12:09:01,368][model2_sft.py][INFO] Epoch:[1/2](7900/63764) loss:3.183 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 12:09:12,547][model2_sft.py][INFO] Epoch:[1/2](7950/63764) loss:2.789 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 12:09:23,712][model2_sft.py][INFO] Epoch:[1/2](8000/63764) loss:3.574 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 12:09:34,838][model2_sft.py][INFO] Epoch:[1/2](8050/63764) loss:2.977 lr:0.0000010 epoch_Time:208.0min: [2023-12-11 12:09:46,014][model2_sft.py][INFO] Epoch:[1/2](8100/63764) loss:3.445 lr:0.0000010 epoch_Time:207.0min: [2023-12-11 12:09:57,144][model2_sft.py][INFO] Epoch:[1/2](8150/63764) loss:2.582 lr:0.0000010 epoch_Time:207.0min: [2023-12-11 12:10:08,300][model2_sft.py][INFO] Epoch:[1/2](8200/63764) loss:3.560 lr:0.0000010 epoch_Time:207.0min: [2023-12-11 12:10:19,473][model2_sft.py][INFO] Epoch:[1/2](8250/63764) loss:3.017 lr:0.0000010 epoch_Time:207.0min: [2023-12-11 12:10:30,652][model2_sft.py][INFO] Epoch:[1/2](8300/63764) loss:2.536 lr:0.0000010 epoch_Time:207.0min: [2023-12-11 12:10:41,775][model2_sft.py][INFO] Epoch:[1/2](8350/63764) loss:4.132 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 12:10:52,925][model2_sft.py][INFO] Epoch:[1/2](8400/63764) loss:3.424 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 12:11:04,103][model2_sft.py][INFO] Epoch:[1/2](8450/63764) loss:3.782 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 12:11:15,235][model2_sft.py][INFO] Epoch:[1/2](8500/63764) loss:3.319 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 12:11:26,438][model2_sft.py][INFO] Epoch:[1/2](8550/63764) loss:2.879 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 
12:11:37,592][model2_sft.py][INFO] Epoch:[1/2](8600/63764) loss:2.962 lr:0.0000010 epoch_Time:206.0min: [2023-12-11 12:11:48,761][model2_sft.py][INFO] Epoch:[1/2](8650/63764) loss:3.718 lr:0.0000010 epoch_Time:205.0min: [2023-12-11 12:11:59,900][model2_sft.py][INFO] Epoch:[1/2](8700/63764) loss:3.387 lr:0.0000010 epoch_Time:205.0min: [2023-12-11 12:12:11,034][model2_sft.py][INFO] Epoch:[1/2](8750/63764) loss:2.833 lr:0.0000010 epoch_Time:205.0min: [2023-12-11 12:12:22,198][model2_sft.py][INFO] Epoch:[1/2](8800/63764) loss:3.093 lr:0.0000010 epoch_Time:205.0min: [2023-12-11 12:12:33,327][model2_sft.py][INFO] Epoch:[1/2](8850/63764) loss:3.465 lr:0.0000010 epoch_Time:205.0min: [2023-12-11 12:12:44,473][model2_sft.py][INFO] Epoch:[1/2](8900/63764) loss:3.623 lr:0.0000010 epoch_Time:204.0min: [2023-12-11 12:12:55,652][model2_sft.py][INFO] Epoch:[1/2](8950/63764) loss:3.323 lr:0.0000010 epoch_Time:204.0min: [2023-12-11 12:13:06,812][model2_sft.py][INFO] Epoch:[1/2](9000/63764) loss:2.830 lr:0.0000010 epoch_Time:204.0min: [2023-12-11 12:13:17,933][model2_sft.py][INFO] Epoch:[1/2](9050/63764) loss:2.916 lr:0.0000010 epoch_Time:204.0min: [2023-12-11 12:13:29,046][model2_sft.py][INFO] Epoch:[1/2](9100/63764) loss:2.968 lr:0.0000010 epoch_Time:204.0min: [2023-12-11 12:13:40,174][model2_sft.py][INFO] Epoch:[1/2](9150/63764) loss:3.457 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 12:13:51,328][model2_sft.py][INFO] Epoch:[1/2](9200/63764) loss:3.688 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 12:14:02,512][model2_sft.py][INFO] Epoch:[1/2](9250/63764) loss:2.984 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 12:14:13,660][model2_sft.py][INFO] Epoch:[1/2](9300/63764) loss:3.273 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 12:14:24,826][model2_sft.py][INFO] Epoch:[1/2](9350/63764) loss:3.147 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 12:14:36,008][model2_sft.py][INFO] Epoch:[1/2](9400/63764) loss:3.522 lr:0.0000010 epoch_Time:203.0min: [2023-12-11 
12:14:47,167][model2_sft.py][INFO] Epoch:[1/2](9450/63764) loss:3.381 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 12:14:58,319][model2_sft.py][INFO] Epoch:[1/2](9500/63764) loss:3.274 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 12:15:09,531][model2_sft.py][INFO] Epoch:[1/2](9550/63764) loss:3.393 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 12:15:20,664][model2_sft.py][INFO] Epoch:[1/2](9600/63764) loss:2.744 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 12:15:31,778][model2_sft.py][INFO] Epoch:[1/2](9650/63764) loss:3.446 lr:0.0000010 epoch_Time:202.0min: [2023-12-11 12:15:42,951][model2_sft.py][INFO] Epoch:[1/2](9700/63764) loss:2.957 lr:0.0000010 epoch_Time:201.0min: [2023-12-11 12:15:54,097][model2_sft.py][INFO] Epoch:[1/2](9750/63764) loss:2.996 lr:0.0000010 epoch_Time:201.0min: [2023-12-11 12:16:05,247][model2_sft.py][INFO] Epoch:[1/2](9800/63764) loss:2.915 lr:0.0000010 epoch_Time:201.0min: [2023-12-11 12:16:16,409][model2_sft.py][INFO] Epoch:[1/2](9850/63764) loss:2.880 lr:0.0000010 epoch_Time:201.0min: [2023-12-11 12:16:27,534][model2_sft.py][INFO] Epoch:[1/2](9900/63764) loss:2.699 lr:0.0000010 epoch_Time:201.0min: [2023-12-11 12:16:38,659][model2_sft.py][INFO] Epoch:[1/2](9950/63764) loss:2.815 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 12:16:49,782][model2_sft.py][INFO] Epoch:[1/2](10000/63764) loss:2.759 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 12:17:00,905][model2_sft.py][INFO] Epoch:[1/2](10050/63764) loss:2.751 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 12:17:12,006][model2_sft.py][INFO] Epoch:[1/2](10100/63764) loss:3.059 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 12:17:23,172][model2_sft.py][INFO] Epoch:[1/2](10150/63764) loss:3.249 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 12:17:34,324][model2_sft.py][INFO] Epoch:[1/2](10200/63764) loss:2.854 lr:0.0000010 epoch_Time:200.0min: [2023-12-11 12:17:45,452][model2_sft.py][INFO] Epoch:[1/2](10250/63764) loss:3.193 lr:0.0000010 epoch_Time:199.0min: [2023-12-11 
12:17:56,633][model2_sft.py][INFO] Epoch:[1/2](10300/63764) loss:2.816 lr:0.0000010 epoch_Time:199.0min: [2023-12-11 12:18:07,796][model2_sft.py][INFO] Epoch:[1/2](10350/63764) loss:3.328 lr:0.0000010 epoch_Time:199.0min: [2023-12-11 12:18:18,896][model2_sft.py][INFO] Epoch:[1/2](10400/63764) loss:3.406 lr:0.0000010 epoch_Time:199.0min: [2023-12-11 12:18:30,048][model2_sft.py][INFO] Epoch:[1/2](10450/63764) loss:3.525 lr:0.0000010 epoch_Time:199.0min: [2023-12-11 12:18:41,176][model2_sft.py][INFO] Epoch:[1/2](10500/63764) loss:3.124 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 12:18:52,326][model2_sft.py][INFO] Epoch:[1/2](10550/63764) loss:3.319 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 12:19:03,494][model2_sft.py][INFO] Epoch:[1/2](10600/63764) loss:3.409 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 12:19:14,664][model2_sft.py][INFO] Epoch:[1/2](10650/63764) loss:3.031 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 12:19:25,844][model2_sft.py][INFO] Epoch:[1/2](10700/63764) loss:3.368 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 12:19:37,008][model2_sft.py][INFO] Epoch:[1/2](10750/63764) loss:2.769 lr:0.0000010 epoch_Time:198.0min: [2023-12-11 12:19:48,161][model2_sft.py][INFO] Epoch:[1/2](10800/63764) loss:3.140 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 12:19:59,402][model2_sft.py][INFO] Epoch:[1/2](10850/63764) loss:2.947 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 12:20:10,570][model2_sft.py][INFO] Epoch:[1/2](10900/63764) loss:2.989 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 12:20:21,766][model2_sft.py][INFO] Epoch:[1/2](10950/63764) loss:3.529 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 12:20:32,953][model2_sft.py][INFO] Epoch:[1/2](11000/63764) loss:2.815 lr:0.0000010 epoch_Time:197.0min: [2023-12-11 12:20:44,196][model2_sft.py][INFO] Epoch:[1/2](11050/63764) loss:3.592 lr:0.0000010 epoch_Time:196.0min: [2023-12-11 12:20:55,356][model2_sft.py][INFO] Epoch:[1/2](11100/63764) loss:3.398 lr:0.0000010 epoch_Time:196.0min: [2023-12-11 
12:21:06,523][model2_sft.py][INFO] Epoch:[1/2](11150/63764) loss:2.354 lr:0.0000010 epoch_Time:196.0min: [2023-12-11 12:21:17,686][model2_sft.py][INFO] Epoch:[1/2](11200/63764) loss:2.897 lr:0.0000010 epoch_Time:196.0min: [2023-12-11 12:21:28,856][model2_sft.py][INFO] Epoch:[1/2](11250/63764) loss:4.329 lr:0.0000010 epoch_Time:196.0min: [2023-12-11 12:21:40,057][model2_sft.py][INFO] Epoch:[1/2](11300/63764) loss:3.530 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 12:21:51,224][model2_sft.py][INFO] Epoch:[1/2](11350/63764) loss:3.460 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 12:22:02,434][model2_sft.py][INFO] Epoch:[1/2](11400/63764) loss:3.109 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 12:22:13,609][model2_sft.py][INFO] Epoch:[1/2](11450/63764) loss:2.970 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 12:22:24,774][model2_sft.py][INFO] Epoch:[1/2](11500/63764) loss:3.340 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 12:22:35,936][model2_sft.py][INFO] Epoch:[1/2](11550/63764) loss:3.218 lr:0.0000010 epoch_Time:195.0min: [2023-12-11 12:22:47,068][model2_sft.py][INFO] Epoch:[1/2](11600/63764) loss:2.941 lr:0.0000010 epoch_Time:194.0min: [2023-12-11 12:22:58,202][model2_sft.py][INFO] Epoch:[1/2](11650/63764) loss:3.612 lr:0.0000010 epoch_Time:194.0min: [2023-12-11 12:23:09,365][model2_sft.py][INFO] Epoch:[1/2](11700/63764) loss:2.991 lr:0.0000010 epoch_Time:194.0min: [2023-12-11 12:23:20,570][model2_sft.py][INFO] Epoch:[1/2](11750/63764) loss:3.408 lr:0.0000010 epoch_Time:194.0min: [2023-12-11 12:23:31,721][model2_sft.py][INFO] Epoch:[1/2](11800/63764) loss:3.417 lr:0.0000010 epoch_Time:194.0min: [2023-12-11 12:23:42,900][model2_sft.py][INFO] Epoch:[1/2](11850/63764) loss:3.002 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 12:23:54,024][model2_sft.py][INFO] Epoch:[1/2](11900/63764) loss:2.854 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 12:24:05,172][model2_sft.py][INFO] Epoch:[1/2](11950/63764) loss:2.617 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 
12:24:16,335][model2_sft.py][INFO] Epoch:[1/2](12000/63764) loss:3.395 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 12:24:27,473][model2_sft.py][INFO] Epoch:[1/2](12050/63764) loss:3.933 lr:0.0000010 epoch_Time:193.0min: [2023-12-11 12:24:38,640][model2_sft.py][INFO] Epoch:[1/2](12100/63764) loss:3.454 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 12:24:49,771][model2_sft.py][INFO] Epoch:[1/2](12150/63764) loss:3.064 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 12:25:00,908][model2_sft.py][INFO] Epoch:[1/2](12200/63764) loss:3.046 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 12:25:12,050][model2_sft.py][INFO] Epoch:[1/2](12250/63764) loss:3.062 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 12:25:23,209][model2_sft.py][INFO] Epoch:[1/2](12300/63764) loss:3.261 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 12:25:34,353][model2_sft.py][INFO] Epoch:[1/2](12350/63764) loss:3.499 lr:0.0000010 epoch_Time:192.0min: [2023-12-11 12:25:45,545][model2_sft.py][INFO] Epoch:[1/2](12400/63764) loss:3.088 lr:0.0000010 epoch_Time:191.0min: [2023-12-11 12:25:56,701][model2_sft.py][INFO] Epoch:[1/2](12450/63764) loss:3.269 lr:0.0000010 epoch_Time:191.0min: [2023-12-11 12:26:07,978][model2_sft.py][INFO] Epoch:[1/2](12500/63764) loss:2.890 lr:0.0000010 epoch_Time:191.0min: [2023-12-11 12:26:19,173][model2_sft.py][INFO] Epoch:[1/2](12550/63764) loss:3.836 lr:0.0000010 epoch_Time:191.0min: [2023-12-11 12:26:30,317][model2_sft.py][INFO] Epoch:[1/2](12600/63764) loss:3.216 lr:0.0000010 epoch_Time:191.0min: [2023-12-11 12:26:41,466][model2_sft.py][INFO] Epoch:[1/2](12650/63764) loss:2.862 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 12:26:52,642][model2_sft.py][INFO] Epoch:[1/2](12700/63764) loss:2.853 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 12:27:03,801][model2_sft.py][INFO] Epoch:[1/2](12750/63764) loss:3.628 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 12:27:14,947][model2_sft.py][INFO] Epoch:[1/2](12800/63764) loss:3.522 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 
12:27:26,103][model2_sft.py][INFO] Epoch:[1/2](12850/63764) loss:3.014 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 12:27:37,243][model2_sft.py][INFO] Epoch:[1/2](12900/63764) loss:3.697 lr:0.0000010 epoch_Time:190.0min: [2023-12-11 12:27:48,415][model2_sft.py][INFO] Epoch:[1/2](12950/63764) loss:3.021 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 12:27:59,606][model2_sft.py][INFO] Epoch:[1/2](13000/63764) loss:3.020 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 12:28:10,805][model2_sft.py][INFO] Epoch:[1/2](13050/63764) loss:3.526 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 12:28:21,953][model2_sft.py][INFO] Epoch:[1/2](13100/63764) loss:3.488 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 12:28:33,100][model2_sft.py][INFO] Epoch:[1/2](13150/63764) loss:3.249 lr:0.0000010 epoch_Time:189.0min: [2023-12-11 12:28:44,229][model2_sft.py][INFO] Epoch:[1/2](13200/63764) loss:2.719 lr:0.0000010 epoch_Time:188.0min: [2023-12-11 12:28:55,339][model2_sft.py][INFO] Epoch:[1/2](13250/63764) loss:3.011 lr:0.0000010 epoch_Time:188.0min: [2023-12-11 12:29:06,540][model2_sft.py][INFO] Epoch:[1/2](13300/63764) loss:3.324 lr:0.0000010 epoch_Time:188.0min: [2023-12-11 12:29:17,721][model2_sft.py][INFO] Epoch:[1/2](13350/63764) loss:3.850 lr:0.0000010 epoch_Time:188.0min: [2023-12-11 12:29:28,873][model2_sft.py][INFO] Epoch:[1/2](13400/63764) loss:3.317 lr:0.0000010 epoch_Time:188.0min: [2023-12-11 12:29:40,006][model2_sft.py][INFO] Epoch:[1/2](13450/63764) loss:3.263 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 12:29:51,134][model2_sft.py][INFO] Epoch:[1/2](13500/63764) loss:3.357 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 12:30:02,329][model2_sft.py][INFO] Epoch:[1/2](13550/63764) loss:3.283 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 12:30:13,519][model2_sft.py][INFO] Epoch:[1/2](13600/63764) loss:2.963 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 12:30:24,676][model2_sft.py][INFO] Epoch:[1/2](13650/63764) loss:3.189 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 
12:30:35,825][model2_sft.py][INFO] Epoch:[1/2](13700/63764) loss:3.116 lr:0.0000010 epoch_Time:187.0min: [2023-12-11 12:30:46,953][model2_sft.py][INFO] Epoch:[1/2](13750/63764) loss:2.848 lr:0.0000010 epoch_Time:186.0min: [2023-12-11 12:30:58,090][model2_sft.py][INFO] Epoch:[1/2](13800/63764) loss:3.171 lr:0.0000010 epoch_Time:186.0min: [2023-12-11 12:31:09,287][model2_sft.py][INFO] Epoch:[1/2](13850/63764) loss:2.499 lr:0.0000010 epoch_Time:186.0min: [2023-12-11 12:31:20,462][model2_sft.py][INFO] Epoch:[1/2](13900/63764) loss:3.980 lr:0.0000010 epoch_Time:186.0min: [2023-12-11 12:31:31,646][model2_sft.py][INFO] Epoch:[1/2](13950/63764) loss:3.411 lr:0.0000010 epoch_Time:186.0min: [2023-12-11 12:31:42,782][model2_sft.py][INFO] Epoch:[1/2](14000/63764) loss:3.394 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 12:31:53,945][model2_sft.py][INFO] Epoch:[1/2](14050/63764) loss:3.293 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 12:32:05,102][model2_sft.py][INFO] Epoch:[1/2](14100/63764) loss:4.025 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 12:32:16,307][model2_sft.py][INFO] Epoch:[1/2](14150/63764) loss:3.723 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 12:32:27,527][model2_sft.py][INFO] Epoch:[1/2](14200/63764) loss:3.674 lr:0.0000010 epoch_Time:185.0min: [2023-12-11 12:32:38,649][model2_sft.py][INFO] Epoch:[1/2](14250/63764) loss:3.119 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 12:32:49,807][model2_sft.py][INFO] Epoch:[1/2](14300/63764) loss:2.923 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 12:33:01,023][model2_sft.py][INFO] Epoch:[1/2](14350/63764) loss:3.544 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 12:33:12,187][model2_sft.py][INFO] Epoch:[1/2](14400/63764) loss:2.723 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 12:33:23,365][model2_sft.py][INFO] Epoch:[1/2](14450/63764) loss:3.059 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 12:33:34,515][model2_sft.py][INFO] Epoch:[1/2](14500/63764) loss:3.350 lr:0.0000010 epoch_Time:184.0min: [2023-12-11 
12:33:45,697][model2_sft.py][INFO] Epoch:[1/2](14550/63764) loss:3.423 lr:0.0000010 epoch_Time:183.0min: [2023-12-11 12:33:56,855][model2_sft.py][INFO] Epoch:[1/2](14600/63764) loss:2.960 lr:0.0000010 epoch_Time:183.0min: [2023-12-11 12:34:08,059][model2_sft.py][INFO] Epoch:[1/2](14650/63764) loss:3.169 lr:0.0000010 epoch_Time:183.0min: [2023-12-11 12:34:19,222][model2_sft.py][INFO] Epoch:[1/2](14700/63764) loss:3.168 lr:0.0000010 epoch_Time:183.0min: [2023-12-11 12:34:30,409][model2_sft.py][INFO] Epoch:[1/2](14750/63764) loss:2.941 lr:0.0000010 epoch_Time:183.0min: [2023-12-11 12:34:41,553][model2_sft.py][INFO] Epoch:[1/2](14800/63764) loss:3.199 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 12:34:52,758][model2_sft.py][INFO] Epoch:[1/2](14850/63764) loss:3.039 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 12:35:03,891][model2_sft.py][INFO] Epoch:[1/2](14900/63764) loss:2.357 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 12:35:15,048][model2_sft.py][INFO] Epoch:[1/2](14950/63764) loss:3.437 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 12:35:26,230][model2_sft.py][INFO] Epoch:[1/2](15000/63764) loss:3.174 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 12:35:37,439][model2_sft.py][INFO] Epoch:[1/2](15050/63764) loss:3.482 lr:0.0000010 epoch_Time:182.0min: [2023-12-11 12:35:48,599][model2_sft.py][INFO] Epoch:[1/2](15100/63764) loss:3.519 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 12:35:59,735][model2_sft.py][INFO] Epoch:[1/2](15150/63764) loss:2.825 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 12:36:10,898][model2_sft.py][INFO] Epoch:[1/2](15200/63764) loss:3.559 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 12:36:22,078][model2_sft.py][INFO] Epoch:[1/2](15250/63764) loss:3.018 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 12:36:33,268][model2_sft.py][INFO] Epoch:[1/2](15300/63764) loss:3.307 lr:0.0000010 epoch_Time:181.0min: [2023-12-11 12:36:44,551][model2_sft.py][INFO] Epoch:[1/2](15350/63764) loss:2.908 lr:0.0000010 epoch_Time:180.0min: [2023-12-11 
12:36:55,748][model2_sft.py][INFO] Epoch:[1/2](15400/63764) loss:3.330 lr:0.0000010 epoch_Time:180.0min: [2023-12-11 12:37:06,903][model2_sft.py][INFO] Epoch:[1/2](15450/63764) loss:3.576 lr:0.0000010 epoch_Time:180.0min: [2023-12-11 12:37:18,131][model2_sft.py][INFO] Epoch:[1/2](15500/63764) loss:2.801 lr:0.0000010 epoch_Time:180.0min: [2023-12-11 12:37:29,327][model2_sft.py][INFO] Epoch:[1/2](15550/63764) loss:2.585 lr:0.0000010 epoch_Time:180.0min: [2023-12-11 12:37:40,488][model2_sft.py][INFO] Epoch:[1/2](15600/63764) loss:2.534 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 12:37:51,662][model2_sft.py][INFO] Epoch:[1/2](15650/63764) loss:2.451 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 12:38:02,855][model2_sft.py][INFO] Epoch:[1/2](15700/63764) loss:3.712 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 12:38:14,041][model2_sft.py][INFO] Epoch:[1/2](15750/63764) loss:3.257 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 12:38:25,295][model2_sft.py][INFO] Epoch:[1/2](15800/63764) loss:3.968 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 12:38:36,465][model2_sft.py][INFO] Epoch:[1/2](15850/63764) loss:2.935 lr:0.0000010 epoch_Time:179.0min: [2023-12-11 12:38:47,636][model2_sft.py][INFO] Epoch:[1/2](15900/63764) loss:2.983 lr:0.0000010 epoch_Time:178.0min: [2023-12-11 12:38:58,826][model2_sft.py][INFO] Epoch:[1/2](15950/63764) loss:3.086 lr:0.0000010 epoch_Time:178.0min: [2023-12-11 12:39:10,030][model2_sft.py][INFO] Epoch:[1/2](16000/63764) loss:3.321 lr:0.0000010 epoch_Time:178.0min: [2023-12-11 12:39:21,223][model2_sft.py][INFO] Epoch:[1/2](16050/63764) loss:3.341 lr:0.0000010 epoch_Time:178.0min: [2023-12-11 12:39:32,430][model2_sft.py][INFO] Epoch:[1/2](16100/63764) loss:3.132 lr:0.0000010 epoch_Time:178.0min: [2023-12-11 12:39:43,593][model2_sft.py][INFO] Epoch:[1/2](16150/63764) loss:3.308 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 12:39:54,778][model2_sft.py][INFO] Epoch:[1/2](16200/63764) loss:3.273 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 
12:40:05,969][model2_sft.py][INFO] Epoch:[1/2](16250/63764) loss:3.173 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 12:40:17,174][model2_sft.py][INFO] Epoch:[1/2](16300/63764) loss:3.025 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 12:40:28,415][model2_sft.py][INFO] Epoch:[1/2](16350/63764) loss:2.407 lr:0.0000010 epoch_Time:177.0min: [2023-12-11 12:40:39,609][model2_sft.py][INFO] Epoch:[1/2](16400/63764) loss:2.759 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 12:40:50,757][model2_sft.py][INFO] Epoch:[1/2](16450/63764) loss:2.562 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 12:41:01,926][model2_sft.py][INFO] Epoch:[1/2](16500/63764) loss:3.561 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 12:41:13,106][model2_sft.py][INFO] Epoch:[1/2](16550/63764) loss:3.047 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 12:41:24,334][model2_sft.py][INFO] Epoch:[1/2](16600/63764) loss:3.346 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 12:41:35,469][model2_sft.py][INFO] Epoch:[1/2](16650/63764) loss:3.480 lr:0.0000010 epoch_Time:176.0min: [2023-12-11 12:41:46,648][model2_sft.py][INFO] Epoch:[1/2](16700/63764) loss:3.300 lr:0.0000010 epoch_Time:175.0min: [2023-12-11 12:41:57,804][model2_sft.py][INFO] Epoch:[1/2](16750/63764) loss:3.190 lr:0.0000010 epoch_Time:175.0min: [2023-12-11 12:42:08,967][model2_sft.py][INFO] Epoch:[1/2](16800/63764) loss:3.865 lr:0.0000010 epoch_Time:175.0min: [2023-12-11 12:42:20,144][model2_sft.py][INFO] Epoch:[1/2](16850/63764) loss:3.133 lr:0.0000010 epoch_Time:175.0min: [2023-12-11 12:42:31,336][model2_sft.py][INFO] Epoch:[1/2](16900/63764) loss:3.248 lr:0.0000010 epoch_Time:175.0min: [2023-12-11 12:42:42,548][model2_sft.py][INFO] Epoch:[1/2](16950/63764) loss:3.017 lr:0.0000010 epoch_Time:174.0min: [2023-12-11 12:42:53,764][model2_sft.py][INFO] Epoch:[1/2](17000/63764) loss:3.536 lr:0.0000010 epoch_Time:174.0min: [2023-12-11 12:43:04,935][model2_sft.py][INFO] Epoch:[1/2](17050/63764) loss:3.290 lr:0.0000010 epoch_Time:174.0min: [2023-12-11 
12:43:16,212][model2_sft.py][INFO] Epoch:[1/2](17100/63764) loss:3.278 lr:0.0000010 epoch_Time:174.0min: [2023-12-11 12:43:27,375][model2_sft.py][INFO] Epoch:[1/2](17150/63764) loss:3.452 lr:0.0000010 epoch_Time:174.0min: [2023-12-11 12:43:38,674][model2_sft.py][INFO] Epoch:[1/2](17200/63764) loss:2.928 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 12:43:49,827][model2_sft.py][INFO] Epoch:[1/2](17250/63764) loss:3.099 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 12:44:01,069][model2_sft.py][INFO] Epoch:[1/2](17300/63764) loss:2.786 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 12:44:12,306][model2_sft.py][INFO] Epoch:[1/2](17350/63764) loss:3.199 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 12:44:23,513][model2_sft.py][INFO] Epoch:[1/2](17400/63764) loss:2.914 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 12:44:34,745][model2_sft.py][INFO] Epoch:[1/2](17450/63764) loss:3.125 lr:0.0000010 epoch_Time:173.0min: [2023-12-11 12:44:45,965][model2_sft.py][INFO] Epoch:[1/2](17500/63764) loss:2.969 lr:0.0000010 epoch_Time:172.0min: [2023-12-11 12:44:57,119][model2_sft.py][INFO] Epoch:[1/2](17550/63764) loss:2.605 lr:0.0000010 epoch_Time:172.0min: [2023-12-11 12:45:08,322][model2_sft.py][INFO] Epoch:[1/2](17600/63764) loss:2.811 lr:0.0000010 epoch_Time:172.0min: [2023-12-11 12:45:19,497][model2_sft.py][INFO] Epoch:[1/2](17650/63764) loss:3.026 lr:0.0000010 epoch_Time:172.0min: [2023-12-11 12:45:30,640][model2_sft.py][INFO] Epoch:[1/2](17700/63764) loss:3.107 lr:0.0000010 epoch_Time:172.0min: [2023-12-11 12:45:41,793][model2_sft.py][INFO] Epoch:[1/2](17750/63764) loss:3.505 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 12:45:52,941][model2_sft.py][INFO] Epoch:[1/2](17800/63764) loss:2.733 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 12:46:04,068][model2_sft.py][INFO] Epoch:[1/2](17850/63764) loss:3.547 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 12:46:15,231][model2_sft.py][INFO] Epoch:[1/2](17900/63764) loss:3.554 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 
12:46:26,469][model2_sft.py][INFO] Epoch:[1/2](17950/63764) loss:3.290 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 12:46:37,637][model2_sft.py][INFO] Epoch:[1/2](18000/63764) loss:3.587 lr:0.0000010 epoch_Time:171.0min: [2023-12-11 12:46:48,771][model2_sft.py][INFO] Epoch:[1/2](18050/63764) loss:3.388 lr:0.0000010 epoch_Time:170.0min: [2023-12-11 12:46:59,899][model2_sft.py][INFO] Epoch:[1/2](18100/63764) loss:3.146 lr:0.0000010 epoch_Time:170.0min: [2023-12-11 12:47:11,066][model2_sft.py][INFO] Epoch:[1/2](18150/63764) loss:3.572 lr:0.0000010 epoch_Time:170.0min: [2023-12-11 12:47:22,247][model2_sft.py][INFO] Epoch:[1/2](18200/63764) loss:2.802 lr:0.0000010 epoch_Time:170.0min: [2023-12-11 12:47:33,406][model2_sft.py][INFO] Epoch:[1/2](18250/63764) loss:2.892 lr:0.0000010 epoch_Time:170.0min: [2023-12-11 12:47:44,569][model2_sft.py][INFO] Epoch:[1/2](18300/63764) loss:3.042 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 12:47:55,743][model2_sft.py][INFO] Epoch:[1/2](18350/63764) loss:3.403 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 12:48:06,940][model2_sft.py][INFO] Epoch:[1/2](18400/63764) loss:3.322 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 12:48:18,164][model2_sft.py][INFO] Epoch:[1/2](18450/63764) loss:3.620 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 12:48:29,289][model2_sft.py][INFO] Epoch:[1/2](18500/63764) loss:3.103 lr:0.0000010 epoch_Time:169.0min: [2023-12-11 12:48:40,438][model2_sft.py][INFO] Epoch:[1/2](18550/63764) loss:2.826 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 12:48:51,567][model2_sft.py][INFO] Epoch:[1/2](18600/63764) loss:2.897 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 12:49:02,749][model2_sft.py][INFO] Epoch:[1/2](18650/63764) loss:3.588 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 12:49:13,917][model2_sft.py][INFO] Epoch:[1/2](18700/63764) loss:3.102 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 12:49:25,092][model2_sft.py][INFO] Epoch:[1/2](18750/63764) loss:2.672 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 
12:49:36,249][model2_sft.py][INFO] Epoch:[1/2](18800/63764) loss:3.039 lr:0.0000010 epoch_Time:168.0min: [2023-12-11 12:49:47,459][model2_sft.py][INFO] Epoch:[1/2](18850/63764) loss:3.140 lr:0.0000010 epoch_Time:167.0min: [2023-12-11 12:49:58,619][model2_sft.py][INFO] Epoch:[1/2](18900/63764) loss:2.898 lr:0.0000010 epoch_Time:167.0min: [2023-12-11 12:50:09,761][model2_sft.py][INFO] Epoch:[1/2](18950/63764) loss:2.748 lr:0.0000010 epoch_Time:167.0min: [2023-12-11 12:50:20,904][model2_sft.py][INFO] Epoch:[1/2](19000/63764) loss:2.604 lr:0.0000010 epoch_Time:167.0min: [2023-12-11 12:50:32,084][model2_sft.py][INFO] Epoch:[1/2](19050/63764) loss:2.887 lr:0.0000010 epoch_Time:167.0min: [2023-12-11 12:50:43,316][model2_sft.py][INFO] Epoch:[1/2](19100/63764) loss:3.392 lr:0.0000010 epoch_Time:166.0min: [2023-12-11 12:50:54,495][model2_sft.py][INFO] Epoch:[1/2](19150/63764) loss:2.883 lr:0.0000010 epoch_Time:166.0min: [2023-12-11 12:51:05,689][model2_sft.py][INFO] Epoch:[1/2](19200/63764) loss:3.218 lr:0.0000010 epoch_Time:166.0min: [2023-12-11 12:51:16,862][model2_sft.py][INFO] Epoch:[1/2](19250/63764) loss:3.169 lr:0.0000010 epoch_Time:166.0min: [2023-12-11 12:51:27,983][model2_sft.py][INFO] Epoch:[1/2](19300/63764) loss:3.352 lr:0.0000010 epoch_Time:166.0min: [2023-12-11 12:51:39,225][model2_sft.py][INFO] Epoch:[1/2](19350/63764) loss:3.011 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 12:51:50,465][model2_sft.py][INFO] Epoch:[1/2](19400/63764) loss:3.423 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 12:52:01,598][model2_sft.py][INFO] Epoch:[1/2](19450/63764) loss:3.342 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 12:52:12,804][model2_sft.py][INFO] Epoch:[1/2](19500/63764) loss:3.038 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 12:52:23,975][model2_sft.py][INFO] Epoch:[1/2](19550/63764) loss:3.707 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 12:52:35,147][model2_sft.py][INFO] Epoch:[1/2](19600/63764) loss:3.362 lr:0.0000010 epoch_Time:165.0min: [2023-12-11 
12:52:46,356][model2_sft.py][INFO] Epoch:[1/2](19650/63764) loss:3.031 lr:0.0000010 epoch_Time:164.0min: [2023-12-11 12:52:57,577][model2_sft.py][INFO] Epoch:[1/2](19700/63764) loss:2.883 lr:0.0000010 epoch_Time:164.0min: [2023-12-11 12:53:08,789][model2_sft.py][INFO] Epoch:[1/2](19750/63764) loss:2.933 lr:0.0000010 epoch_Time:164.0min: [2023-12-11 12:53:19,987][model2_sft.py][INFO] Epoch:[1/2](19800/63764) loss:3.562 lr:0.0000010 epoch_Time:164.0min: [2023-12-11 12:53:31,179][model2_sft.py][INFO] Epoch:[1/2](19850/63764) loss:3.402 lr:0.0000010 epoch_Time:164.0min: [2023-12-11 12:53:42,335][model2_sft.py][INFO] Epoch:[1/2](19900/63764) loss:3.509 lr:0.0000010 epoch_Time:163.0min: [2023-12-11 12:53:53,539][model2_sft.py][INFO] Epoch:[1/2](19950/63764) loss:2.907 lr:0.0000010 epoch_Time:163.0min: [2023-12-11 12:54:04,774][model2_sft.py][INFO] Epoch:[1/2](20000/63764) loss:3.073 lr:0.0000010 epoch_Time:163.0min: [2023-12-11 12:54:16,043][model2_sft.py][INFO] Epoch:[1/2](20050/63764) loss:3.520 lr:0.0000010 epoch_Time:163.0min: [2023-12-11 12:54:27,270][model2_sft.py][INFO] Epoch:[1/2](20100/63764) loss:2.877 lr:0.0000010 epoch_Time:163.0min: [2023-12-11 12:54:38,432][model2_sft.py][INFO] Epoch:[1/2](20150/63764) loss:3.366 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 12:54:49,564][model2_sft.py][INFO] Epoch:[1/2](20200/63764) loss:3.361 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 12:55:00,754][model2_sft.py][INFO] Epoch:[1/2](20250/63764) loss:3.495 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 12:55:11,973][model2_sft.py][INFO] Epoch:[1/2](20300/63764) loss:3.467 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 12:55:23,139][model2_sft.py][INFO] Epoch:[1/2](20350/63764) loss:3.498 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 12:55:34,359][model2_sft.py][INFO] Epoch:[1/2](20400/63764) loss:3.541 lr:0.0000010 epoch_Time:162.0min: [2023-12-11 12:55:45,511][model2_sft.py][INFO] Epoch:[1/2](20450/63764) loss:3.951 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 
12:55:56,629][model2_sft.py][INFO] Epoch:[1/2](20500/63764) loss:3.653 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 12:56:07,764][model2_sft.py][INFO] Epoch:[1/2](20550/63764) loss:3.063 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 12:56:19,003][model2_sft.py][INFO] Epoch:[1/2](20600/63764) loss:2.968 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 12:56:30,191][model2_sft.py][INFO] Epoch:[1/2](20650/63764) loss:3.293 lr:0.0000010 epoch_Time:161.0min: [2023-12-11 12:56:41,346][model2_sft.py][INFO] Epoch:[1/2](20700/63764) loss:3.063 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 12:56:52,492][model2_sft.py][INFO] Epoch:[1/2](20750/63764) loss:3.339 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 12:57:03,635][model2_sft.py][INFO] Epoch:[1/2](20800/63764) loss:3.194 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 12:57:14,828][model2_sft.py][INFO] Epoch:[1/2](20850/63764) loss:3.526 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 12:57:26,038][model2_sft.py][INFO] Epoch:[1/2](20900/63764) loss:3.522 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 12:57:37,228][model2_sft.py][INFO] Epoch:[1/2](20950/63764) loss:3.101 lr:0.0000010 epoch_Time:160.0min: [2023-12-11 12:57:48,432][model2_sft.py][INFO] Epoch:[1/2](21000/63764) loss:3.196 lr:0.0000010 epoch_Time:159.0min: [2023-12-11 12:57:59,594][model2_sft.py][INFO] Epoch:[1/2](21050/63764) loss:2.806 lr:0.0000010 epoch_Time:159.0min: [2023-12-11 12:58:10,779][model2_sft.py][INFO] Epoch:[1/2](21100/63764) loss:2.948 lr:0.0000010 epoch_Time:159.0min: [2023-12-11 12:58:21,934][model2_sft.py][INFO] Epoch:[1/2](21150/63764) loss:3.063 lr:0.0000010 epoch_Time:159.0min: [2023-12-11 12:58:33,129][model2_sft.py][INFO] Epoch:[1/2](21200/63764) loss:2.757 lr:0.0000010 epoch_Time:159.0min: [2023-12-11 12:58:44,302][model2_sft.py][INFO] Epoch:[1/2](21250/63764) loss:3.022 lr:0.0000010 epoch_Time:158.0min: [2023-12-11 12:58:55,485][model2_sft.py][INFO] Epoch:[1/2](21300/63764) loss:3.761 lr:0.0000010 epoch_Time:158.0min: [2023-12-11 
12:59:06,666][model2_sft.py][INFO] Epoch:[1/2](21350/63764) loss:3.068 lr:0.0000010 epoch_Time:158.0min: [2023-12-11 12:59:17,840][model2_sft.py][INFO] Epoch:[1/2](21400/63764) loss:2.778 lr:0.0000010 epoch_Time:158.0min: [2023-12-11 12:59:29,017][model2_sft.py][INFO] Epoch:[1/2](21450/63764) loss:3.162 lr:0.0000010 epoch_Time:158.0min: [2023-12-11 12:59:40,234][model2_sft.py][INFO] Epoch:[1/2](21500/63764) loss:2.936 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 12:59:51,449][model2_sft.py][INFO] Epoch:[1/2](21550/63764) loss:3.382 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 13:00:02,651][model2_sft.py][INFO] Epoch:[1/2](21600/63764) loss:3.248 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 13:00:13,843][model2_sft.py][INFO] Epoch:[1/2](21650/63764) loss:3.232 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 13:00:25,022][model2_sft.py][INFO] Epoch:[1/2](21700/63764) loss:3.511 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 13:00:36,199][model2_sft.py][INFO] Epoch:[1/2](21750/63764) loss:2.642 lr:0.0000010 epoch_Time:157.0min: [2023-12-11 13:00:47,396][model2_sft.py][INFO] Epoch:[1/2](21800/63764) loss:3.729 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 13:00:58,544][model2_sft.py][INFO] Epoch:[1/2](21850/63764) loss:2.962 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 13:01:09,682][model2_sft.py][INFO] Epoch:[1/2](21900/63764) loss:3.743 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 13:01:20,846][model2_sft.py][INFO] Epoch:[1/2](21950/63764) loss:3.182 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 13:01:31,979][model2_sft.py][INFO] Epoch:[1/2](22000/63764) loss:3.023 lr:0.0000010 epoch_Time:156.0min: [2023-12-11 13:01:43,182][model2_sft.py][INFO] Epoch:[1/2](22050/63764) loss:3.056 lr:0.0000010 epoch_Time:155.0min: [2023-12-11 13:01:54,374][model2_sft.py][INFO] Epoch:[1/2](22100/63764) loss:3.283 lr:0.0000010 epoch_Time:155.0min: [2023-12-11 13:02:05,564][model2_sft.py][INFO] Epoch:[1/2](22150/63764) loss:2.311 lr:0.0000010 epoch_Time:155.0min: [2023-12-11 
13:02:16,792][model2_sft.py][INFO] Epoch:[1/2](22200/63764) loss:3.067 lr:0.0000010 epoch_Time:155.0min: [2023-12-11 13:02:27,971][model2_sft.py][INFO] Epoch:[1/2](22250/63764) loss:3.024 lr:0.0000010 epoch_Time:155.0min: [2023-12-11 13:02:39,153][model2_sft.py][INFO] Epoch:[1/2](22300/63764) loss:2.953 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 13:02:50,288][model2_sft.py][INFO] Epoch:[1/2](22350/63764) loss:3.481 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 13:03:01,483][model2_sft.py][INFO] Epoch:[1/2](22400/63764) loss:3.545 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 13:03:12,694][model2_sft.py][INFO] Epoch:[1/2](22450/63764) loss:3.175 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 13:03:23,919][model2_sft.py][INFO] Epoch:[1/2](22500/63764) loss:3.104 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 13:03:35,109][model2_sft.py][INFO] Epoch:[1/2](22550/63764) loss:2.887 lr:0.0000010 epoch_Time:154.0min: [2023-12-11 13:03:46,277][model2_sft.py][INFO] Epoch:[1/2](22600/63764) loss:3.103 lr:0.0000010 epoch_Time:153.0min: [2023-12-11 13:03:57,460][model2_sft.py][INFO] Epoch:[1/2](22650/63764) loss:3.102 lr:0.0000010 epoch_Time:153.0min: [2023-12-11 13:04:08,652][model2_sft.py][INFO] Epoch:[1/2](22700/63764) loss:3.115 lr:0.0000010 epoch_Time:153.0min: [2023-12-11 13:04:19,907][model2_sft.py][INFO] Epoch:[1/2](22750/63764) loss:2.533 lr:0.0000010 epoch_Time:153.0min: [2023-12-11 13:04:31,099][model2_sft.py][INFO] Epoch:[1/2](22800/63764) loss:3.213 lr:0.0000010 epoch_Time:153.0min: [2023-12-11 13:04:42,271][model2_sft.py][INFO] Epoch:[1/2](22850/63764) loss:3.017 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 13:04:53,474][model2_sft.py][INFO] Epoch:[1/2](22900/63764) loss:2.987 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 13:05:04,678][model2_sft.py][INFO] Epoch:[1/2](22950/63764) loss:1.932 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 13:05:15,834][model2_sft.py][INFO] Epoch:[1/2](23000/63764) loss:3.782 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 
13:05:27,119][model2_sft.py][INFO] Epoch:[1/2](23050/63764) loss:3.407 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 13:05:38,313][model2_sft.py][INFO] Epoch:[1/2](23100/63764) loss:3.044 lr:0.0000010 epoch_Time:152.0min: [2023-12-11 13:05:49,518][model2_sft.py][INFO] Epoch:[1/2](23150/63764) loss:3.627 lr:0.0000010 epoch_Time:151.0min: [2023-12-11 13:06:00,701][model2_sft.py][INFO] Epoch:[1/2](23200/63764) loss:3.119 lr:0.0000010 epoch_Time:151.0min: [2023-12-11 13:06:11,879][model2_sft.py][INFO] Epoch:[1/2](23250/63764) loss:2.756 lr:0.0000010 epoch_Time:151.0min: [2023-12-11 13:06:23,083][model2_sft.py][INFO] Epoch:[1/2](23300/63764) loss:2.613 lr:0.0000010 epoch_Time:151.0min: [2023-12-11 13:06:34,304][model2_sft.py][INFO] Epoch:[1/2](23350/63764) loss:3.059 lr:0.0000010 epoch_Time:151.0min: [2023-12-11 13:06:45,494][model2_sft.py][INFO] Epoch:[1/2](23400/63764) loss:3.316 lr:0.0000010 epoch_Time:150.0min: [2023-12-11 13:06:56,666][model2_sft.py][INFO] Epoch:[1/2](23450/63764) loss:3.221 lr:0.0000010 epoch_Time:150.0min: [2023-12-11 13:07:07,889][model2_sft.py][INFO] Epoch:[1/2](23500/63764) loss:3.067 lr:0.0000010 epoch_Time:150.0min: [2023-12-11 13:07:19,055][model2_sft.py][INFO] Epoch:[1/2](23550/63764) loss:2.626 lr:0.0000010 epoch_Time:150.0min: [2023-12-11 13:07:30,201][model2_sft.py][INFO] Epoch:[1/2](23600/63764) loss:2.798 lr:0.0000010 epoch_Time:150.0min: [2023-12-11 13:07:41,397][model2_sft.py][INFO] Epoch:[1/2](23650/63764) loss:2.596 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 13:07:52,556][model2_sft.py][INFO] Epoch:[1/2](23700/63764) loss:3.060 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 13:08:03,735][model2_sft.py][INFO] Epoch:[1/2](23750/63764) loss:3.425 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 13:08:14,935][model2_sft.py][INFO] Epoch:[1/2](23800/63764) loss:2.864 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 13:08:26,146][model2_sft.py][INFO] Epoch:[1/2](23850/63764) loss:3.464 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 
13:08:37,318][model2_sft.py][INFO] Epoch:[1/2](23900/63764) loss:2.819 lr:0.0000010 epoch_Time:149.0min: [2023-12-11 13:08:48,527][model2_sft.py][INFO] Epoch:[1/2](23950/63764) loss:3.542 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 13:08:59,727][model2_sft.py][INFO] Epoch:[1/2](24000/63764) loss:3.296 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 13:09:10,921][model2_sft.py][INFO] Epoch:[1/2](24050/63764) loss:3.100 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 13:09:22,099][model2_sft.py][INFO] Epoch:[1/2](24100/63764) loss:3.358 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 13:09:33,265][model2_sft.py][INFO] Epoch:[1/2](24150/63764) loss:3.838 lr:0.0000010 epoch_Time:148.0min: [2023-12-11 13:09:44,458][model2_sft.py][INFO] Epoch:[1/2](24200/63764) loss:3.081 lr:0.0000010 epoch_Time:147.0min: [2023-12-11 13:09:55,629][model2_sft.py][INFO] Epoch:[1/2](24250/63764) loss:3.321 lr:0.0000010 epoch_Time:147.0min: [2023-12-11 13:10:06,869][model2_sft.py][INFO] Epoch:[1/2](24300/63764) loss:3.513 lr:0.0000010 epoch_Time:147.0min: [2023-12-11 13:10:18,026][model2_sft.py][INFO] Epoch:[1/2](24350/63764) loss:2.375 lr:0.0000010 epoch_Time:147.0min: [2023-12-11 13:10:29,243][model2_sft.py][INFO] Epoch:[1/2](24400/63764) loss:2.808 lr:0.0000010 epoch_Time:147.0min: [2023-12-11 13:10:40,433][model2_sft.py][INFO] Epoch:[1/2](24450/63764) loss:3.443 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 13:10:51,565][model2_sft.py][INFO] Epoch:[1/2](24500/63764) loss:3.103 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 13:11:02,769][model2_sft.py][INFO] Epoch:[1/2](24550/63764) loss:3.777 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 13:11:13,971][model2_sft.py][INFO] Epoch:[1/2](24600/63764) loss:3.448 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 13:11:25,175][model2_sft.py][INFO] Epoch:[1/2](24650/63764) loss:3.420 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 13:11:36,329][model2_sft.py][INFO] Epoch:[1/2](24700/63764) loss:2.994 lr:0.0000010 epoch_Time:146.0min: [2023-12-11 
13:11:47,489][model2_sft.py][INFO] Epoch:[1/2](24750/63764) loss:2.918 lr:0.0000010 epoch_Time:145.0min: [2023-12-11 13:11:58,657][model2_sft.py][INFO] Epoch:[1/2](24800/63764) loss:3.063 lr:0.0000010 epoch_Time:145.0min: [2023-12-11 13:12:09,861][model2_sft.py][INFO] Epoch:[1/2](24850/63764) loss:2.922 lr:0.0000010 epoch_Time:145.0min: [2023-12-11 13:12:21,086][model2_sft.py][INFO] Epoch:[1/2](24900/63764) loss:3.061 lr:0.0000010 epoch_Time:145.0min: [2023-12-11 13:12:32,303][model2_sft.py][INFO] Epoch:[1/2](24950/63764) loss:2.808 lr:0.0000010 epoch_Time:145.0min: [2023-12-11 13:12:43,489][model2_sft.py][INFO] Epoch:[1/2](25000/63764) loss:2.426 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 13:12:54,682][model2_sft.py][INFO] Epoch:[1/2](25050/63764) loss:4.036 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 13:13:05,873][model2_sft.py][INFO] Epoch:[1/2](25100/63764) loss:2.768 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 13:13:17,053][model2_sft.py][INFO] Epoch:[1/2](25150/63764) loss:3.403 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 13:13:28,231][model2_sft.py][INFO] Epoch:[1/2](25200/63764) loss:2.192 lr:0.0000010 epoch_Time:144.0min: [2023-12-11 13:13:39,410][model2_sft.py][INFO] Epoch:[1/2](25250/63764) loss:2.868 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 13:13:50,588][model2_sft.py][INFO] Epoch:[1/2](25300/63764) loss:3.550 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 13:14:01,773][model2_sft.py][INFO] Epoch:[1/2](25350/63764) loss:3.134 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 13:14:12,916][model2_sft.py][INFO] Epoch:[1/2](25400/63764) loss:3.511 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 13:14:24,054][model2_sft.py][INFO] Epoch:[1/2](25450/63764) loss:3.160 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 13:14:35,272][model2_sft.py][INFO] Epoch:[1/2](25500/63764) loss:3.522 lr:0.0000010 epoch_Time:143.0min: [2023-12-11 13:14:46,416][model2_sft.py][INFO] Epoch:[1/2](25550/63764) loss:3.679 lr:0.0000010 epoch_Time:142.0min: [2023-12-11 
13:14:57,611][model2_sft.py][INFO] Epoch:[1/2](25600/63764) loss:3.276 lr:0.0000010 epoch_Time:142.0min: [2023-12-11 13:15:08,802][model2_sft.py][INFO] Epoch:[1/2](25650/63764) loss:2.989 lr:0.0000010 epoch_Time:142.0min: [2023-12-11 13:15:19,998][model2_sft.py][INFO] Epoch:[1/2](25700/63764) loss:3.737 lr:0.0000010 epoch_Time:142.0min: [2023-12-11 13:15:31,203][model2_sft.py][INFO] Epoch:[1/2](25750/63764) loss:2.967 lr:0.0000010 epoch_Time:142.0min: [2023-12-11 13:15:42,420][model2_sft.py][INFO] Epoch:[1/2](25800/63764) loss:3.918 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 13:15:53,587][model2_sft.py][INFO] Epoch:[1/2](25850/63764) loss:2.978 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 13:16:04,753][model2_sft.py][INFO] Epoch:[1/2](25900/63764) loss:3.102 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 13:16:15,950][model2_sft.py][INFO] Epoch:[1/2](25950/63764) loss:3.153 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 13:16:27,130][model2_sft.py][INFO] Epoch:[1/2](26000/63764) loss:3.152 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 13:16:38,277][model2_sft.py][INFO] Epoch:[1/2](26050/63764) loss:2.993 lr:0.0000010 epoch_Time:141.0min: [2023-12-11 13:16:49,478][model2_sft.py][INFO] Epoch:[1/2](26100/63764) loss:3.535 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 13:17:00,684][model2_sft.py][INFO] Epoch:[1/2](26150/63764) loss:3.452 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 13:17:11,872][model2_sft.py][INFO] Epoch:[1/2](26200/63764) loss:3.193 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 13:17:23,081][model2_sft.py][INFO] Epoch:[1/2](26250/63764) loss:3.275 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 13:17:34,222][model2_sft.py][INFO] Epoch:[1/2](26300/63764) loss:2.910 lr:0.0000010 epoch_Time:140.0min: [2023-12-11 13:17:45,393][model2_sft.py][INFO] Epoch:[1/2](26350/63764) loss:3.465 lr:0.0000010 epoch_Time:139.0min: [2023-12-11 13:17:56,566][model2_sft.py][INFO] Epoch:[1/2](26400/63764) loss:2.932 lr:0.0000010 epoch_Time:139.0min: [2023-12-11 
13:18:07,775][model2_sft.py][INFO] Epoch:[1/2](26450/63764) loss:3.310 lr:0.0000010 epoch_Time:139.0min: [2023-12-11 13:18:18,988][model2_sft.py][INFO] Epoch:[1/2](26500/63764) loss:3.243 lr:0.0000010 epoch_Time:139.0min: [2023-12-11 13:18:30,182][model2_sft.py][INFO] Epoch:[1/2](26550/63764) loss:2.520 lr:0.0000010 epoch_Time:139.0min: [2023-12-11 13:18:41,334][model2_sft.py][INFO] Epoch:[1/2](26600/63764) loss:3.399 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 13:18:52,511][model2_sft.py][INFO] Epoch:[1/2](26650/63764) loss:2.921 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 13:19:03,721][model2_sft.py][INFO] Epoch:[1/2](26700/63764) loss:2.683 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 13:19:14,886][model2_sft.py][INFO] Epoch:[1/2](26750/63764) loss:3.259 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 13:19:26,018][model2_sft.py][INFO] Epoch:[1/2](26800/63764) loss:3.587 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 13:19:37,265][model2_sft.py][INFO] Epoch:[1/2](26850/63764) loss:2.676 lr:0.0000010 epoch_Time:138.0min: [2023-12-11 13:19:48,430][model2_sft.py][INFO] Epoch:[1/2](26900/63764) loss:3.977 lr:0.0000010 epoch_Time:137.0min: [2023-12-11 13:19:59,605][model2_sft.py][INFO] Epoch:[1/2](26950/63764) loss:3.039 lr:0.0000010 epoch_Time:137.0min: [2023-12-11 13:20:10,807][model2_sft.py][INFO] Epoch:[1/2](27000/63764) loss:3.550 lr:0.0000010 epoch_Time:137.0min: [2023-12-11 13:20:22,002][model2_sft.py][INFO] Epoch:[1/2](27050/63764) loss:3.083 lr:0.0000010 epoch_Time:137.0min: [2023-12-11 13:20:33,257][model2_sft.py][INFO] Epoch:[1/2](27100/63764) loss:3.090 lr:0.0000010 epoch_Time:137.0min: [2023-12-11 13:20:44,486][model2_sft.py][INFO] Epoch:[1/2](27150/63764) loss:3.301 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 13:20:55,656][model2_sft.py][INFO] Epoch:[1/2](27200/63764) loss:2.339 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 13:21:06,827][model2_sft.py][INFO] Epoch:[1/2](27250/63764) loss:2.934 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 
13:21:17,983][model2_sft.py][INFO] Epoch:[1/2](27300/63764) loss:3.582 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 13:21:29,172][model2_sft.py][INFO] Epoch:[1/2](27350/63764) loss:2.408 lr:0.0000010 epoch_Time:136.0min: [2023-12-11 13:21:40,309][model2_sft.py][INFO] Epoch:[1/2](27400/63764) loss:3.541 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 13:21:51,500][model2_sft.py][INFO] Epoch:[1/2](27450/63764) loss:3.400 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 13:22:02,672][model2_sft.py][INFO] Epoch:[1/2](27500/63764) loss:2.876 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 13:22:13,865][model2_sft.py][INFO] Epoch:[1/2](27550/63764) loss:3.177 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 13:22:25,005][model2_sft.py][INFO] Epoch:[1/2](27600/63764) loss:3.088 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 13:22:36,203][model2_sft.py][INFO] Epoch:[1/2](27650/63764) loss:2.969 lr:0.0000010 epoch_Time:135.0min: [2023-12-11 13:22:47,349][model2_sft.py][INFO] Epoch:[1/2](27700/63764) loss:3.242 lr:0.0000010 epoch_Time:134.0min: [2023-12-11 13:22:58,566][model2_sft.py][INFO] Epoch:[1/2](27750/63764) loss:2.855 lr:0.0000010 epoch_Time:134.0min: [2023-12-11 13:23:09,728][model2_sft.py][INFO] Epoch:[1/2](27800/63764) loss:3.318 lr:0.0000010 epoch_Time:134.0min: [2023-12-11 13:23:20,939][model2_sft.py][INFO] Epoch:[1/2](27850/63764) loss:2.633 lr:0.0000010 epoch_Time:134.0min: [2023-12-11 13:23:32,081][model2_sft.py][INFO] Epoch:[1/2](27900/63764) loss:3.151 lr:0.0000010 epoch_Time:134.0min: [2023-12-11 13:23:43,275][model2_sft.py][INFO] Epoch:[1/2](27950/63764) loss:3.510 lr:0.0000010 epoch_Time:133.0min: [2023-12-11 13:23:54,446][model2_sft.py][INFO] Epoch:[1/2](28000/63764) loss:3.451 lr:0.0000010 epoch_Time:133.0min: [2023-12-11 13:24:05,638][model2_sft.py][INFO] Epoch:[1/2](28050/63764) loss:2.796 lr:0.0000010 epoch_Time:133.0min: [2023-12-11 13:24:16,747][model2_sft.py][INFO] Epoch:[1/2](28100/63764) loss:3.355 lr:0.0000010 epoch_Time:133.0min: [2023-12-11 
13:24:27,913][model2_sft.py][INFO] Epoch:[1/2](28150/63764) loss:2.606 lr:0.0000010 epoch_Time:133.0min: [2023-12-11 13:24:39,118][model2_sft.py][INFO] Epoch:[1/2](28200/63764) loss:2.904 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 13:24:50,288][model2_sft.py][INFO] Epoch:[1/2](28250/63764) loss:2.783 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 13:25:01,462][model2_sft.py][INFO] Epoch:[1/2](28300/63764) loss:3.858 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 13:25:12,613][model2_sft.py][INFO] Epoch:[1/2](28350/63764) loss:3.173 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 13:25:23,825][model2_sft.py][INFO] Epoch:[1/2](28400/63764) loss:3.490 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 13:25:34,992][model2_sft.py][INFO] Epoch:[1/2](28450/63764) loss:3.129 lr:0.0000010 epoch_Time:132.0min: [2023-12-11 13:25:46,160][model2_sft.py][INFO] Epoch:[1/2](28500/63764) loss:3.189 lr:0.0000010 epoch_Time:131.0min: [2023-12-11 13:25:57,316][model2_sft.py][INFO] Epoch:[1/2](28550/63764) loss:3.266 lr:0.0000010 epoch_Time:131.0min: [2023-12-11 13:26:08,541][model2_sft.py][INFO] Epoch:[1/2](28600/63764) loss:3.334 lr:0.0000010 epoch_Time:131.0min: [2023-12-11 13:26:19,743][model2_sft.py][INFO] Epoch:[1/2](28650/63764) loss:3.743 lr:0.0000010 epoch_Time:131.0min: [2023-12-11 13:26:30,930][model2_sft.py][INFO] Epoch:[1/2](28700/63764) loss:3.346 lr:0.0000010 epoch_Time:131.0min: [2023-12-11 13:26:42,108][model2_sft.py][INFO] Epoch:[1/2](28750/63764) loss:3.131 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 13:26:53,284][model2_sft.py][INFO] Epoch:[1/2](28800/63764) loss:3.108 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 13:27:04,478][model2_sft.py][INFO] Epoch:[1/2](28850/63764) loss:2.995 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 13:27:15,633][model2_sft.py][INFO] Epoch:[1/2](28900/63764) loss:3.801 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 13:27:26,806][model2_sft.py][INFO] Epoch:[1/2](28950/63764) loss:3.443 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 
13:27:37,981][model2_sft.py][INFO] Epoch:[1/2](29000/63764) loss:2.703 lr:0.0000010 epoch_Time:130.0min: [2023-12-11 13:27:49,161][model2_sft.py][INFO] Epoch:[1/2](29050/63764) loss:3.439 lr:0.0000010 epoch_Time:129.0min: [2023-12-11 13:28:00,320][model2_sft.py][INFO] Epoch:[1/2](29100/63764) loss:2.797 lr:0.0000010 epoch_Time:129.0min: [2023-12-11 13:28:11,489][model2_sft.py][INFO] Epoch:[1/2](29150/63764) loss:3.239 lr:0.0000010 epoch_Time:129.0min: [2023-12-11 13:28:22,660][model2_sft.py][INFO] Epoch:[1/2](29200/63764) loss:3.288 lr:0.0000010 epoch_Time:129.0min: [2023-12-11 13:28:33,899][model2_sft.py][INFO] Epoch:[1/2](29250/63764) loss:3.256 lr:0.0000010 epoch_Time:129.0min: [2023-12-11 13:28:45,109][model2_sft.py][INFO] Epoch:[1/2](29300/63764) loss:3.089 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 13:28:56,292][model2_sft.py][INFO] Epoch:[1/2](29350/63764) loss:2.465 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 13:29:07,502][model2_sft.py][INFO] Epoch:[1/2](29400/63764) loss:3.202 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 13:29:18,668][model2_sft.py][INFO] Epoch:[1/2](29450/63764) loss:3.302 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 13:29:29,806][model2_sft.py][INFO] Epoch:[1/2](29500/63764) loss:3.420 lr:0.0000010 epoch_Time:128.0min: [2023-12-11 13:29:40,967][model2_sft.py][INFO] Epoch:[1/2](29550/63764) loss:3.565 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 13:29:52,155][model2_sft.py][INFO] Epoch:[1/2](29600/63764) loss:2.942 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 13:30:03,267][model2_sft.py][INFO] Epoch:[1/2](29650/63764) loss:3.090 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 13:30:14,396][model2_sft.py][INFO] Epoch:[1/2](29700/63764) loss:3.488 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 13:30:25,567][model2_sft.py][INFO] Epoch:[1/2](29750/63764) loss:3.127 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 13:30:36,696][model2_sft.py][INFO] Epoch:[1/2](29800/63764) loss:3.656 lr:0.0000010 epoch_Time:127.0min: [2023-12-11 
13:30:47,848][model2_sft.py][INFO] Epoch:[1/2](29850/63764) loss:2.616 lr:0.0000010 epoch_Time:126.0min: [2023-12-11 13:30:59,083][model2_sft.py][INFO] Epoch:[1/2](29900/63764) loss:2.652 lr:0.0000010 epoch_Time:126.0min: [2023-12-11 13:31:10,308][model2_sft.py][INFO] Epoch:[1/2](29950/63764) loss:3.206 lr:0.0000010 epoch_Time:126.0min: [2023-12-11 13:31:21,481][model2_sft.py][INFO] Epoch:[1/2](30000/63764) loss:2.987 lr:0.0000010 epoch_Time:126.0min: [2023-12-11 13:31:32,624][model2_sft.py][INFO] Epoch:[1/2](30050/63764) loss:3.311 lr:0.0000010 epoch_Time:126.0min: [2023-12-11 13:31:43,757][model2_sft.py][INFO] Epoch:[1/2](30100/63764) loss:2.774 lr:0.0000010 epoch_Time:125.0min: [2023-12-11 13:31:54,870][model2_sft.py][INFO] Epoch:[1/2](30150/63764) loss:3.792 lr:0.0000010 epoch_Time:125.0min: [2023-12-11 13:32:06,012][model2_sft.py][INFO] Epoch:[1/2](30200/63764) loss:3.142 lr:0.0000010 epoch_Time:125.0min: [2023-12-11 13:32:17,152][model2_sft.py][INFO] Epoch:[1/2](30250/63764) loss:3.758 lr:0.0000010 epoch_Time:125.0min: [2023-12-11 13:32:28,322][model2_sft.py][INFO] Epoch:[1/2](30300/63764) loss:3.261 lr:0.0000010 epoch_Time:125.0min: [2023-12-11 13:32:39,463][model2_sft.py][INFO] Epoch:[1/2](30350/63764) loss:3.256 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 13:32:50,647][model2_sft.py][INFO] Epoch:[1/2](30400/63764) loss:3.452 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 13:33:01,886][model2_sft.py][INFO] Epoch:[1/2](30450/63764) loss:2.890 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 13:33:13,090][model2_sft.py][INFO] Epoch:[1/2](30500/63764) loss:2.925 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 13:33:24,289][model2_sft.py][INFO] Epoch:[1/2](30550/63764) loss:2.923 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 13:33:35,492][model2_sft.py][INFO] Epoch:[1/2](30600/63764) loss:3.280 lr:0.0000010 epoch_Time:124.0min: [2023-12-11 13:33:46,622][model2_sft.py][INFO] Epoch:[1/2](30650/63764) loss:2.686 lr:0.0000010 epoch_Time:123.0min: [2023-12-11 
13:33:57,757][model2_sft.py][INFO] Epoch:[1/2](30700/63764) loss:2.933 lr:0.0000010 epoch_Time:123.0min: [2023-12-11 13:34:08,923][model2_sft.py][INFO] Epoch:[1/2](30750/63764) loss:3.131 lr:0.0000010 epoch_Time:123.0min: [2023-12-11 13:34:20,122][model2_sft.py][INFO] Epoch:[1/2](30800/63764) loss:3.986 lr:0.0000010 epoch_Time:123.0min: [2023-12-11 13:34:31,321][model2_sft.py][INFO] Epoch:[1/2](30850/63764) loss:3.024 lr:0.0000010 epoch_Time:123.0min: [2023-12-11 13:34:42,440][model2_sft.py][INFO] Epoch:[1/2](30900/63764) loss:3.890 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 13:34:53,542][model2_sft.py][INFO] Epoch:[1/2](30950/63764) loss:3.139 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 13:35:04,699][model2_sft.py][INFO] Epoch:[1/2](31000/63764) loss:2.476 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 13:35:15,821][model2_sft.py][INFO] Epoch:[1/2](31050/63764) loss:3.882 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 13:35:26,958][model2_sft.py][INFO] Epoch:[1/2](31100/63764) loss:3.589 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 13:35:38,108][model2_sft.py][INFO] Epoch:[1/2](31150/63764) loss:3.330 lr:0.0000010 epoch_Time:122.0min: [2023-12-11 13:35:49,262][model2_sft.py][INFO] Epoch:[1/2](31200/63764) loss:3.614 lr:0.0000010 epoch_Time:121.0min: [2023-12-11 13:36:00,381][model2_sft.py][INFO] Epoch:[1/2](31250/63764) loss:3.062 lr:0.0000010 epoch_Time:121.0min: [2023-12-11 13:36:11,461][model2_sft.py][INFO] Epoch:[1/2](31300/63764) loss:3.005 lr:0.0000010 epoch_Time:121.0min: [2023-12-11 13:36:22,591][model2_sft.py][INFO] Epoch:[1/2](31350/63764) loss:2.781 lr:0.0000010 epoch_Time:121.0min: [2023-12-11 13:36:33,673][model2_sft.py][INFO] Epoch:[1/2](31400/63764) loss:3.076 lr:0.0000010 epoch_Time:121.0min: [2023-12-11 13:36:44,804][model2_sft.py][INFO] Epoch:[1/2](31450/63764) loss:3.169 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 13:36:55,910][model2_sft.py][INFO] Epoch:[1/2](31500/63764) loss:3.716 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 
13:37:07,005][model2_sft.py][INFO] Epoch:[1/2](31550/63764) loss:3.226 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 13:37:18,152][model2_sft.py][INFO] Epoch:[1/2](31600/63764) loss:2.739 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 13:37:29,257][model2_sft.py][INFO] Epoch:[1/2](31650/63764) loss:2.834 lr:0.0000010 epoch_Time:120.0min: [2023-12-11 13:37:40,398][model2_sft.py][INFO] Epoch:[1/2](31700/63764) loss:2.963 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 13:37:51,570][model2_sft.py][INFO] Epoch:[1/2](31750/63764) loss:3.419 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 13:38:02,715][model2_sft.py][INFO] Epoch:[1/2](31800/63764) loss:3.088 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 13:38:13,869][model2_sft.py][INFO] Epoch:[1/2](31850/63764) loss:2.894 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 13:38:25,034][model2_sft.py][INFO] Epoch:[1/2](31900/63764) loss:3.498 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 13:38:36,201][model2_sft.py][INFO] Epoch:[1/2](31950/63764) loss:2.854 lr:0.0000010 epoch_Time:119.0min: [2023-12-11 13:38:47,330][model2_sft.py][INFO] Epoch:[1/2](32000/63764) loss:4.243 lr:0.0000010 epoch_Time:118.0min: [2023-12-11 13:38:58,449][model2_sft.py][INFO] Epoch:[1/2](32050/63764) loss:3.175 lr:0.0000010 epoch_Time:118.0min: [2023-12-11 13:39:09,616][model2_sft.py][INFO] Epoch:[1/2](32100/63764) loss:3.699 lr:0.0000010 epoch_Time:118.0min: [2023-12-11 13:39:20,831][model2_sft.py][INFO] Epoch:[1/2](32150/63764) loss:2.919 lr:0.0000010 epoch_Time:118.0min: [2023-12-11 13:39:32,019][model2_sft.py][INFO] Epoch:[1/2](32200/63764) loss:3.442 lr:0.0000010 epoch_Time:118.0min: [2023-12-11 13:39:43,195][model2_sft.py][INFO] Epoch:[1/2](32250/63764) loss:3.471 lr:0.0000010 epoch_Time:117.0min: [2023-12-11 13:39:54,325][model2_sft.py][INFO] Epoch:[1/2](32300/63764) loss:3.656 lr:0.0000010 epoch_Time:117.0min: [2023-12-11 13:40:05,532][model2_sft.py][INFO] Epoch:[1/2](32350/63764) loss:3.019 lr:0.0000010 epoch_Time:117.0min: [2023-12-11 
13:40:16,710][model2_sft.py][INFO] Epoch:[1/2](32400/63764) loss:3.175 lr:0.0000010 epoch_Time:117.0min: [2023-12-11 13:40:27,903][model2_sft.py][INFO] Epoch:[1/2](32450/63764) loss:2.247 lr:0.0000010 epoch_Time:117.0min: [2023-12-11 13:40:39,108][model2_sft.py][INFO] Epoch:[1/2](32500/63764) loss:3.643 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 13:40:50,263][model2_sft.py][INFO] Epoch:[1/2](32550/63764) loss:3.360 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 13:41:01,425][model2_sft.py][INFO] Epoch:[1/2](32600/63764) loss:2.986 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 13:41:12,643][model2_sft.py][INFO] Epoch:[1/2](32650/63764) loss:3.071 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 13:41:23,847][model2_sft.py][INFO] Epoch:[1/2](32700/63764) loss:3.188 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 13:41:34,935][model2_sft.py][INFO] Epoch:[1/2](32750/63764) loss:2.669 lr:0.0000010 epoch_Time:116.0min: [2023-12-11 13:41:46,053][model2_sft.py][INFO] Epoch:[1/2](32800/63764) loss:3.438 lr:0.0000010 epoch_Time:115.0min: [2023-12-11 13:41:57,272][model2_sft.py][INFO] Epoch:[1/2](32850/63764) loss:2.800 lr:0.0000010 epoch_Time:115.0min: [2023-12-11 13:42:08,450][model2_sft.py][INFO] Epoch:[1/2](32900/63764) loss:2.616 lr:0.0000010 epoch_Time:115.0min: [2023-12-11 13:42:19,641][model2_sft.py][INFO] Epoch:[1/2](32950/63764) loss:3.488 lr:0.0000010 epoch_Time:115.0min: [2023-12-11 13:42:30,813][model2_sft.py][INFO] Epoch:[1/2](33000/63764) loss:2.900 lr:0.0000010 epoch_Time:115.0min: [2023-12-11 13:42:41,987][model2_sft.py][INFO] Epoch:[1/2](33050/63764) loss:3.406 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 13:42:53,150][model2_sft.py][INFO] Epoch:[1/2](33100/63764) loss:3.049 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 13:43:04,375][model2_sft.py][INFO] Epoch:[1/2](33150/63764) loss:3.464 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 13:43:15,527][model2_sft.py][INFO] Epoch:[1/2](33200/63764) loss:3.538 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 
13:43:26,674][model2_sft.py][INFO] Epoch:[1/2](33250/63764) loss:3.068 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 13:43:37,846][model2_sft.py][INFO] Epoch:[1/2](33300/63764) loss:3.010 lr:0.0000010 epoch_Time:114.0min: [2023-12-11 13:43:49,047][model2_sft.py][INFO] Epoch:[1/2](33350/63764) loss:3.173 lr:0.0000010 epoch_Time:113.0min: [2023-12-11 13:44:00,244][model2_sft.py][INFO] Epoch:[1/2](33400/63764) loss:3.176 lr:0.0000010 epoch_Time:113.0min: [2023-12-11 13:44:11,482][model2_sft.py][INFO] Epoch:[1/2](33450/63764) loss:3.419 lr:0.0000010 epoch_Time:113.0min: [2023-12-11 13:44:22,688][model2_sft.py][INFO] Epoch:[1/2](33500/63764) loss:2.947 lr:0.0000010 epoch_Time:113.0min: [2023-12-11 13:44:33,899][model2_sft.py][INFO] Epoch:[1/2](33550/63764) loss:2.846 lr:0.0000010 epoch_Time:113.0min: [2023-12-11 13:44:45,076][model2_sft.py][INFO] Epoch:[1/2](33600/63764) loss:3.193 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 13:44:56,245][model2_sft.py][INFO] Epoch:[1/2](33650/63764) loss:2.699 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 13:45:07,404][model2_sft.py][INFO] Epoch:[1/2](33700/63764) loss:3.611 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 13:45:18,563][model2_sft.py][INFO] Epoch:[1/2](33750/63764) loss:2.620 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 13:45:29,743][model2_sft.py][INFO] Epoch:[1/2](33800/63764) loss:2.947 lr:0.0000010 epoch_Time:112.0min: [2023-12-11 13:45:40,911][model2_sft.py][INFO] Epoch:[1/2](33850/63764) loss:2.847 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 13:45:52,064][model2_sft.py][INFO] Epoch:[1/2](33900/63764) loss:3.379 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 13:46:03,217][model2_sft.py][INFO] Epoch:[1/2](33950/63764) loss:3.265 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 13:46:14,361][model2_sft.py][INFO] Epoch:[1/2](34000/63764) loss:3.113 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 13:46:25,496][model2_sft.py][INFO] Epoch:[1/2](34050/63764) loss:3.814 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 
13:46:36,636][model2_sft.py][INFO] Epoch:[1/2](34100/63764) loss:3.329 lr:0.0000010 epoch_Time:111.0min: [2023-12-11 13:46:47,816][model2_sft.py][INFO] Epoch:[1/2](34150/63764) loss:3.321 lr:0.0000010 epoch_Time:110.0min: [2023-12-11 13:46:58,936][model2_sft.py][INFO] Epoch:[1/2](34200/63764) loss:2.870 lr:0.0000010 epoch_Time:110.0min: [2023-12-11 13:47:10,090][model2_sft.py][INFO] Epoch:[1/2](34250/63764) loss:2.836 lr:0.0000010 epoch_Time:110.0min: [2023-12-11 13:47:21,217][model2_sft.py][INFO] Epoch:[1/2](34300/63764) loss:2.954 lr:0.0000010 epoch_Time:110.0min: [2023-12-11 13:47:32,391][model2_sft.py][INFO] Epoch:[1/2](34350/63764) loss:2.728 lr:0.0000010 epoch_Time:110.0min: [2023-12-11 13:47:43,521][model2_sft.py][INFO] Epoch:[1/2](34400/63764) loss:3.638 lr:0.0000010 epoch_Time:109.0min: [2023-12-11 13:47:54,698][model2_sft.py][INFO] Epoch:[1/2](34450/63764) loss:3.903 lr:0.0000010 epoch_Time:109.0min: [2023-12-11 13:48:05,904][model2_sft.py][INFO] Epoch:[1/2](34500/63764) loss:2.813 lr:0.0000010 epoch_Time:109.0min: [2023-12-11 13:48:16,987][model2_sft.py][INFO] Epoch:[1/2](34550/63764) loss:3.046 lr:0.0000010 epoch_Time:109.0min: [2023-12-11 13:48:28,172][model2_sft.py][INFO] Epoch:[1/2](34600/63764) loss:3.390 lr:0.0000010 epoch_Time:109.0min: [2023-12-11 13:48:39,335][model2_sft.py][INFO] Epoch:[1/2](34650/63764) loss:3.143 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 13:48:50,448][model2_sft.py][INFO] Epoch:[1/2](34700/63764) loss:2.682 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 13:49:01,593][model2_sft.py][INFO] Epoch:[1/2](34750/63764) loss:3.781 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 13:49:12,791][model2_sft.py][INFO] Epoch:[1/2](34800/63764) loss:2.879 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 13:49:24,060][model2_sft.py][INFO] Epoch:[1/2](34850/63764) loss:3.416 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 13:49:35,213][model2_sft.py][INFO] Epoch:[1/2](34900/63764) loss:3.154 lr:0.0000010 epoch_Time:108.0min: [2023-12-11 
13:49:46,328][model2_sft.py][INFO] Epoch:[1/2](34950/63764) loss:2.877 lr:0.0000010 epoch_Time:107.0min: [2023-12-11 13:49:57,507][model2_sft.py][INFO] Epoch:[1/2](35000/63764) loss:3.052 lr:0.0000010 epoch_Time:107.0min: [2023-12-11 13:50:08,646][model2_sft.py][INFO] Epoch:[1/2](35050/63764) loss:3.188 lr:0.0000010 epoch_Time:107.0min: [2023-12-11 13:50:19,812][model2_sft.py][INFO] Epoch:[1/2](35100/63764) loss:3.311 lr:0.0000010 epoch_Time:107.0min: [2023-12-11 13:50:30,939][model2_sft.py][INFO] Epoch:[1/2](35150/63764) loss:3.032 lr:0.0000010 epoch_Time:107.0min: [2023-12-11 13:50:42,042][model2_sft.py][INFO] Epoch:[1/2](35200/63764) loss:2.924 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 13:50:53,222][model2_sft.py][INFO] Epoch:[1/2](35250/63764) loss:2.870 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 13:51:04,411][model2_sft.py][INFO] Epoch:[1/2](35300/63764) loss:3.348 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 13:51:15,532][model2_sft.py][INFO] Epoch:[1/2](35350/63764) loss:3.483 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 13:51:26,671][model2_sft.py][INFO] Epoch:[1/2](35400/63764) loss:3.368 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 13:51:37,807][model2_sft.py][INFO] Epoch:[1/2](35450/63764) loss:3.640 lr:0.0000010 epoch_Time:106.0min: [2023-12-11 13:51:48,967][model2_sft.py][INFO] Epoch:[1/2](35500/63764) loss:3.737 lr:0.0000010 epoch_Time:105.0min: [2023-12-11 13:52:00,134][model2_sft.py][INFO] Epoch:[1/2](35550/63764) loss:3.232 lr:0.0000010 epoch_Time:105.0min: [2023-12-11 13:52:11,318][model2_sft.py][INFO] Epoch:[1/2](35600/63764) loss:2.975 lr:0.0000010 epoch_Time:105.0min: [2023-12-11 13:52:22,471][model2_sft.py][INFO] Epoch:[1/2](35650/63764) loss:3.413 lr:0.0000010 epoch_Time:105.0min: [2023-12-11 13:52:33,632][model2_sft.py][INFO] Epoch:[1/2](35700/63764) loss:2.707 lr:0.0000010 epoch_Time:105.0min: [2023-12-11 13:52:44,829][model2_sft.py][INFO] Epoch:[1/2](35750/63764) loss:3.199 lr:0.0000010 epoch_Time:104.0min: [2023-12-11 
13:52:55,971][model2_sft.py][INFO] Epoch:[1/2](35800/63764) loss:3.082 lr:0.0000010 epoch_Time:104.0min: [2023-12-11 13:53:07,088][model2_sft.py][INFO] Epoch:[1/2](35850/63764) loss:3.452 lr:0.0000010 epoch_Time:104.0min: [2023-12-11 13:53:18,236][model2_sft.py][INFO] Epoch:[1/2](35900/63764) loss:3.826 lr:0.0000010 epoch_Time:104.0min: [2023-12-11 13:53:29,378][model2_sft.py][INFO] Epoch:[1/2](35950/63764) loss:3.042 lr:0.0000010 epoch_Time:104.0min: [2023-12-11 13:53:40,526][model2_sft.py][INFO] Epoch:[1/2](36000/63764) loss:2.693 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 13:53:51,724][model2_sft.py][INFO] Epoch:[1/2](36050/63764) loss:3.442 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 13:54:02,885][model2_sft.py][INFO] Epoch:[1/2](36100/63764) loss:3.216 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 13:54:13,995][model2_sft.py][INFO] Epoch:[1/2](36150/63764) loss:3.486 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 13:54:25,111][model2_sft.py][INFO] Epoch:[1/2](36200/63764) loss:3.555 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 13:54:36,249][model2_sft.py][INFO] Epoch:[1/2](36250/63764) loss:2.950 lr:0.0000010 epoch_Time:103.0min: [2023-12-11 13:54:47,416][model2_sft.py][INFO] Epoch:[1/2](36300/63764) loss:3.875 lr:0.0000010 epoch_Time:102.0min: [2023-12-11 13:54:58,608][model2_sft.py][INFO] Epoch:[1/2](36350/63764) loss:2.890 lr:0.0000010 epoch_Time:102.0min: [2023-12-11 13:55:09,761][model2_sft.py][INFO] Epoch:[1/2](36400/63764) loss:2.858 lr:0.0000010 epoch_Time:102.0min: [2023-12-11 13:55:20,965][model2_sft.py][INFO] Epoch:[1/2](36450/63764) loss:3.566 lr:0.0000010 epoch_Time:102.0min: [2023-12-11 13:55:32,151][model2_sft.py][INFO] Epoch:[1/2](36500/63764) loss:3.679 lr:0.0000010 epoch_Time:102.0min: [2023-12-11 13:55:43,334][model2_sft.py][INFO] Epoch:[1/2](36550/63764) loss:3.156 lr:0.0000010 epoch_Time:101.0min: [2023-12-11 13:55:54,491][model2_sft.py][INFO] Epoch:[1/2](36600/63764) loss:3.259 lr:0.0000010 epoch_Time:101.0min: [2023-12-11 
13:56:05,664][model2_sft.py][INFO] Epoch:[1/2](36650/63764) loss:3.961 lr:0.0000010 epoch_Time:101.0min: [2023-12-11 13:56:16,855][model2_sft.py][INFO] Epoch:[1/2](36700/63764) loss:3.140 lr:0.0000010 epoch_Time:101.0min: [2023-12-11 13:56:28,037][model2_sft.py][INFO] Epoch:[1/2](36750/63764) loss:3.382 lr:0.0000010 epoch_Time:101.0min: [2023-12-11 13:56:39,249][model2_sft.py][INFO] Epoch:[1/2](36800/63764) loss:3.453 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 13:56:50,458][model2_sft.py][INFO] Epoch:[1/2](36850/63764) loss:2.951 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 13:57:01,658][model2_sft.py][INFO] Epoch:[1/2](36900/63764) loss:3.414 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 13:57:12,819][model2_sft.py][INFO] Epoch:[1/2](36950/63764) loss:3.294 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 13:57:23,988][model2_sft.py][INFO] Epoch:[1/2](37000/63764) loss:2.592 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 13:57:35,182][model2_sft.py][INFO] Epoch:[1/2](37050/63764) loss:3.700 lr:0.0000010 epoch_Time:100.0min: [2023-12-11 13:57:46,341][model2_sft.py][INFO] Epoch:[1/2](37100/63764) loss:3.596 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 13:57:57,530][model2_sft.py][INFO] Epoch:[1/2](37150/63764) loss:3.623 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 13:58:08,730][model2_sft.py][INFO] Epoch:[1/2](37200/63764) loss:3.469 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 13:58:19,853][model2_sft.py][INFO] Epoch:[1/2](37250/63764) loss:3.898 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 13:58:30,976][model2_sft.py][INFO] Epoch:[1/2](37300/63764) loss:2.867 lr:0.0000010 epoch_Time:99.0min: [2023-12-11 13:58:42,174][model2_sft.py][INFO] Epoch:[1/2](37350/63764) loss:3.126 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 13:58:53,350][model2_sft.py][INFO] Epoch:[1/2](37400/63764) loss:3.538 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 13:59:04,509][model2_sft.py][INFO] Epoch:[1/2](37450/63764) loss:3.134 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 
13:59:15,692][model2_sft.py][INFO] Epoch:[1/2](37500/63764) loss:3.255 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 13:59:26,866][model2_sft.py][INFO] Epoch:[1/2](37550/63764) loss:3.104 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 13:59:38,034][model2_sft.py][INFO] Epoch:[1/2](37600/63764) loss:3.132 lr:0.0000010 epoch_Time:98.0min: [2023-12-11 13:59:49,211][model2_sft.py][INFO] Epoch:[1/2](37650/63764) loss:3.079 lr:0.0000010 epoch_Time:97.0min: [2023-12-11 14:00:00,370][model2_sft.py][INFO] Epoch:[1/2](37700/63764) loss:3.030 lr:0.0000010 epoch_Time:97.0min: [2023-12-11 14:00:11,523][model2_sft.py][INFO] Epoch:[1/2](37750/63764) loss:3.710 lr:0.0000010 epoch_Time:97.0min: [2023-12-11 14:00:22,669][model2_sft.py][INFO] Epoch:[1/2](37800/63764) loss:2.758 lr:0.0000010 epoch_Time:97.0min: [2023-12-11 14:00:33,868][model2_sft.py][INFO] Epoch:[1/2](37850/63764) loss:3.110 lr:0.0000010 epoch_Time:97.0min: [2023-12-11 14:00:45,000][model2_sft.py][INFO] Epoch:[1/2](37900/63764) loss:3.523 lr:0.0000010 epoch_Time:96.0min: [2023-12-11 14:00:56,149][model2_sft.py][INFO] Epoch:[1/2](37950/63764) loss:2.679 lr:0.0000010 epoch_Time:96.0min: [2023-12-11 14:01:07,284][model2_sft.py][INFO] Epoch:[1/2](38000/63764) loss:3.039 lr:0.0000010 epoch_Time:96.0min: [2023-12-11 14:01:18,438][model2_sft.py][INFO] Epoch:[1/2](38050/63764) loss:2.797 lr:0.0000010 epoch_Time:96.0min: [2023-12-11 14:01:29,633][model2_sft.py][INFO] Epoch:[1/2](38100/63764) loss:3.024 lr:0.0000010 epoch_Time:96.0min: [2023-12-11 14:01:40,793][model2_sft.py][INFO] Epoch:[1/2](38150/63764) loss:3.046 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 14:01:51,998][model2_sft.py][INFO] Epoch:[1/2](38200/63764) loss:2.772 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 14:02:03,130][model2_sft.py][INFO] Epoch:[1/2](38250/63764) loss:2.979 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 14:02:14,287][model2_sft.py][INFO] Epoch:[1/2](38300/63764) loss:3.262 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 
14:02:25,469][model2_sft.py][INFO] Epoch:[1/2](38350/63764) loss:3.138 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 14:02:36,679][model2_sft.py][INFO] Epoch:[1/2](38400/63764) loss:3.995 lr:0.0000010 epoch_Time:95.0min: [2023-12-11 14:02:47,853][model2_sft.py][INFO] Epoch:[1/2](38450/63764) loss:2.687 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 14:02:59,015][model2_sft.py][INFO] Epoch:[1/2](38500/63764) loss:3.783 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 14:03:10,137][model2_sft.py][INFO] Epoch:[1/2](38550/63764) loss:3.360 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 14:03:21,305][model2_sft.py][INFO] Epoch:[1/2](38600/63764) loss:4.059 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 14:03:32,489][model2_sft.py][INFO] Epoch:[1/2](38650/63764) loss:3.753 lr:0.0000010 epoch_Time:94.0min: [2023-12-11 14:03:43,656][model2_sft.py][INFO] Epoch:[1/2](38700/63764) loss:3.106 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 14:03:54,833][model2_sft.py][INFO] Epoch:[1/2](38750/63764) loss:3.323 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 14:04:05,981][model2_sft.py][INFO] Epoch:[1/2](38800/63764) loss:2.815 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 14:04:17,138][model2_sft.py][INFO] Epoch:[1/2](38850/63764) loss:3.544 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 14:04:28,324][model2_sft.py][INFO] Epoch:[1/2](38900/63764) loss:2.980 lr:0.0000010 epoch_Time:93.0min: [2023-12-11 14:04:39,484][model2_sft.py][INFO] Epoch:[1/2](38950/63764) loss:3.431 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 14:04:50,597][model2_sft.py][INFO] Epoch:[1/2](39000/63764) loss:3.157 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 14:05:01,791][model2_sft.py][INFO] Epoch:[1/2](39050/63764) loss:3.602 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 14:05:12,884][model2_sft.py][INFO] Epoch:[1/2](39100/63764) loss:3.443 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 14:05:24,020][model2_sft.py][INFO] Epoch:[1/2](39150/63764) loss:2.881 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 
14:05:35,195][model2_sft.py][INFO] Epoch:[1/2](39200/63764) loss:3.076 lr:0.0000010 epoch_Time:92.0min: [2023-12-11 14:05:46,398][model2_sft.py][INFO] Epoch:[1/2](39250/63764) loss:3.433 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 14:05:57,613][model2_sft.py][INFO] Epoch:[1/2](39300/63764) loss:3.605 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 14:06:08,835][model2_sft.py][INFO] Epoch:[1/2](39350/63764) loss:3.194 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 14:06:20,020][model2_sft.py][INFO] Epoch:[1/2](39400/63764) loss:3.747 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 14:06:31,211][model2_sft.py][INFO] Epoch:[1/2](39450/63764) loss:3.535 lr:0.0000010 epoch_Time:91.0min: [2023-12-11 14:06:42,424][model2_sft.py][INFO] Epoch:[1/2](39500/63764) loss:3.748 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 14:06:53,618][model2_sft.py][INFO] Epoch:[1/2](39550/63764) loss:3.341 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 14:07:04,764][model2_sft.py][INFO] Epoch:[1/2](39600/63764) loss:3.457 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 14:07:15,932][model2_sft.py][INFO] Epoch:[1/2](39650/63764) loss:3.134 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 14:07:27,088][model2_sft.py][INFO] Epoch:[1/2](39700/63764) loss:2.562 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 14:07:38,241][model2_sft.py][INFO] Epoch:[1/2](39750/63764) loss:3.326 lr:0.0000010 epoch_Time:90.0min: [2023-12-11 14:07:49,407][model2_sft.py][INFO] Epoch:[1/2](39800/63764) loss:2.867 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 14:08:00,615][model2_sft.py][INFO] Epoch:[1/2](39850/63764) loss:3.102 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 14:08:11,823][model2_sft.py][INFO] Epoch:[1/2](39900/63764) loss:2.955 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 14:08:22,997][model2_sft.py][INFO] Epoch:[1/2](39950/63764) loss:3.159 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 14:08:34,240][model2_sft.py][INFO] Epoch:[1/2](40000/63764) loss:3.548 lr:0.0000010 epoch_Time:89.0min: [2023-12-11 
14:08:45,438][model2_sft.py][INFO] Epoch:[1/2](40050/63764) loss:3.288 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 14:08:56,618][model2_sft.py][INFO] Epoch:[1/2](40100/63764) loss:2.866 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 14:09:07,816][model2_sft.py][INFO] Epoch:[1/2](40150/63764) loss:3.661 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 14:09:19,079][model2_sft.py][INFO] Epoch:[1/2](40200/63764) loss:3.204 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 14:09:30,241][model2_sft.py][INFO] Epoch:[1/2](40250/63764) loss:2.877 lr:0.0000010 epoch_Time:88.0min: [2023-12-11 14:09:41,417][model2_sft.py][INFO] Epoch:[1/2](40300/63764) loss:3.712 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 14:09:52,619][model2_sft.py][INFO] Epoch:[1/2](40350/63764) loss:3.332 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 14:10:03,809][model2_sft.py][INFO] Epoch:[1/2](40400/63764) loss:2.987 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 14:10:15,007][model2_sft.py][INFO] Epoch:[1/2](40450/63764) loss:3.258 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 14:10:26,229][model2_sft.py][INFO] Epoch:[1/2](40500/63764) loss:3.777 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 14:10:37,378][model2_sft.py][INFO] Epoch:[1/2](40550/63764) loss:3.331 lr:0.0000010 epoch_Time:87.0min: [2023-12-11 14:10:48,570][model2_sft.py][INFO] Epoch:[1/2](40600/63764) loss:3.299 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 14:10:59,783][model2_sft.py][INFO] Epoch:[1/2](40650/63764) loss:2.776 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 14:11:10,943][model2_sft.py][INFO] Epoch:[1/2](40700/63764) loss:3.561 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 14:11:22,109][model2_sft.py][INFO] Epoch:[1/2](40750/63764) loss:3.119 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 14:11:33,350][model2_sft.py][INFO] Epoch:[1/2](40800/63764) loss:2.778 lr:0.0000010 epoch_Time:86.0min: [2023-12-11 14:11:44,536][model2_sft.py][INFO] Epoch:[1/2](40850/63764) loss:3.641 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 
14:11:55,701][model2_sft.py][INFO] Epoch:[1/2](40900/63764) loss:3.784 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 14:12:06,867][model2_sft.py][INFO] Epoch:[1/2](40950/63764) loss:3.621 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 14:12:18,050][model2_sft.py][INFO] Epoch:[1/2](41000/63764) loss:3.516 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 14:12:29,203][model2_sft.py][INFO] Epoch:[1/2](41050/63764) loss:3.234 lr:0.0000010 epoch_Time:85.0min: [2023-12-11 14:12:40,382][model2_sft.py][INFO] Epoch:[1/2](41100/63764) loss:3.658 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 14:12:51,579][model2_sft.py][INFO] Epoch:[1/2](41150/63764) loss:3.916 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 14:13:02,778][model2_sft.py][INFO] Epoch:[1/2](41200/63764) loss:3.450 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 14:13:13,982][model2_sft.py][INFO] Epoch:[1/2](41250/63764) loss:3.408 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 14:13:25,184][model2_sft.py][INFO] Epoch:[1/2](41300/63764) loss:3.007 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 14:13:36,374][model2_sft.py][INFO] Epoch:[1/2](41350/63764) loss:3.495 lr:0.0000010 epoch_Time:84.0min: [2023-12-11 14:13:47,534][model2_sft.py][INFO] Epoch:[1/2](41400/63764) loss:2.604 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 14:13:58,754][model2_sft.py][INFO] Epoch:[1/2](41450/63764) loss:3.259 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 14:14:09,952][model2_sft.py][INFO] Epoch:[1/2](41500/63764) loss:3.688 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 14:14:21,208][model2_sft.py][INFO] Epoch:[1/2](41550/63764) loss:3.916 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 14:14:32,408][model2_sft.py][INFO] Epoch:[1/2](41600/63764) loss:2.963 lr:0.0000010 epoch_Time:83.0min: [2023-12-11 14:14:43,568][model2_sft.py][INFO] Epoch:[1/2](41650/63764) loss:3.736 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 14:14:54,733][model2_sft.py][INFO] Epoch:[1/2](41700/63764) loss:2.616 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 
14:15:05,912][model2_sft.py][INFO] Epoch:[1/2](41750/63764) loss:3.719 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 14:15:17,122][model2_sft.py][INFO] Epoch:[1/2](41800/63764) loss:3.277 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 14:15:28,286][model2_sft.py][INFO] Epoch:[1/2](41850/63764) loss:3.345 lr:0.0000010 epoch_Time:82.0min: [2023-12-11 14:15:39,476][model2_sft.py][INFO] Epoch:[1/2](41900/63764) loss:3.120 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 14:15:50,655][model2_sft.py][INFO] Epoch:[1/2](41950/63764) loss:3.002 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 14:16:01,815][model2_sft.py][INFO] Epoch:[1/2](42000/63764) loss:3.372 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 14:16:13,031][model2_sft.py][INFO] Epoch:[1/2](42050/63764) loss:3.371 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 14:16:24,173][model2_sft.py][INFO] Epoch:[1/2](42100/63764) loss:3.983 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 14:16:35,307][model2_sft.py][INFO] Epoch:[1/2](42150/63764) loss:3.142 lr:0.0000010 epoch_Time:81.0min: [2023-12-11 14:16:46,448][model2_sft.py][INFO] Epoch:[1/2](42200/63764) loss:3.187 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 14:16:57,633][model2_sft.py][INFO] Epoch:[1/2](42250/63764) loss:3.219 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 14:17:08,827][model2_sft.py][INFO] Epoch:[1/2](42300/63764) loss:3.306 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 14:17:19,976][model2_sft.py][INFO] Epoch:[1/2](42350/63764) loss:3.814 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 14:17:31,151][model2_sft.py][INFO] Epoch:[1/2](42400/63764) loss:2.547 lr:0.0000010 epoch_Time:80.0min: [2023-12-11 14:17:42,280][model2_sft.py][INFO] Epoch:[1/2](42450/63764) loss:2.726 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 14:17:53,462][model2_sft.py][INFO] Epoch:[1/2](42500/63764) loss:3.237 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 14:18:04,657][model2_sft.py][INFO] Epoch:[1/2](42550/63764) loss:3.626 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 
14:18:15,830][model2_sft.py][INFO] Epoch:[1/2](42600/63764) loss:3.446 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 14:18:26,942][model2_sft.py][INFO] Epoch:[1/2](42650/63764) loss:3.033 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 14:18:38,067][model2_sft.py][INFO] Epoch:[1/2](42700/63764) loss:3.482 lr:0.0000010 epoch_Time:79.0min: [2023-12-11 14:18:49,251][model2_sft.py][INFO] Epoch:[1/2](42750/63764) loss:3.278 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 14:19:00,433][model2_sft.py][INFO] Epoch:[1/2](42800/63764) loss:3.538 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 14:19:11,604][model2_sft.py][INFO] Epoch:[1/2](42850/63764) loss:2.965 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 14:19:22,760][model2_sft.py][INFO] Epoch:[1/2](42900/63764) loss:4.055 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 14:19:33,892][model2_sft.py][INFO] Epoch:[1/2](42950/63764) loss:3.410 lr:0.0000010 epoch_Time:78.0min: [2023-12-11 14:19:45,033][model2_sft.py][INFO] Epoch:[1/2](43000/63764) loss:2.825 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 14:19:56,191][model2_sft.py][INFO] Epoch:[1/2](43050/63764) loss:3.285 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 14:20:07,398][model2_sft.py][INFO] Epoch:[1/2](43100/63764) loss:2.752 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 14:20:18,558][model2_sft.py][INFO] Epoch:[1/2](43150/63764) loss:3.142 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 14:20:29,731][model2_sft.py][INFO] Epoch:[1/2](43200/63764) loss:3.649 lr:0.0000010 epoch_Time:77.0min: [2023-12-11 14:20:40,938][model2_sft.py][INFO] Epoch:[1/2](43250/63764) loss:3.309 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 14:20:52,096][model2_sft.py][INFO] Epoch:[1/2](43300/63764) loss:2.813 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 14:21:03,299][model2_sft.py][INFO] Epoch:[1/2](43350/63764) loss:3.462 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 14:21:14,405][model2_sft.py][INFO] Epoch:[1/2](43400/63764) loss:3.380 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 
14:21:25,553][model2_sft.py][INFO] Epoch:[1/2](43450/63764) loss:3.662 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 14:21:36,732][model2_sft.py][INFO] Epoch:[1/2](43500/63764) loss:3.684 lr:0.0000010 epoch_Time:76.0min: [2023-12-11 14:21:47,940][model2_sft.py][INFO] Epoch:[1/2](43550/63764) loss:3.900 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 14:21:59,102][model2_sft.py][INFO] Epoch:[1/2](43600/63764) loss:3.914 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 14:22:10,253][model2_sft.py][INFO] Epoch:[1/2](43650/63764) loss:3.870 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 14:22:21,520][model2_sft.py][INFO] Epoch:[1/2](43700/63764) loss:3.238 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 14:22:32,706][model2_sft.py][INFO] Epoch:[1/2](43750/63764) loss:2.895 lr:0.0000010 epoch_Time:75.0min: [2023-12-11 14:22:43,893][model2_sft.py][INFO] Epoch:[1/2](43800/63764) loss:3.351 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 14:22:55,091][model2_sft.py][INFO] Epoch:[1/2](43850/63764) loss:3.308 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 14:23:06,248][model2_sft.py][INFO] Epoch:[1/2](43900/63764) loss:3.209 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 14:23:17,427][model2_sft.py][INFO] Epoch:[1/2](43950/63764) loss:3.733 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 14:23:28,617][model2_sft.py][INFO] Epoch:[1/2](44000/63764) loss:3.294 lr:0.0000010 epoch_Time:74.0min: [2023-12-11 14:23:39,806][model2_sft.py][INFO] Epoch:[1/2](44050/63764) loss:3.683 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 14:23:50,943][model2_sft.py][INFO] Epoch:[1/2](44100/63764) loss:3.737 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 14:24:02,099][model2_sft.py][INFO] Epoch:[1/2](44150/63764) loss:3.228 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 14:24:13,283][model2_sft.py][INFO] Epoch:[1/2](44200/63764) loss:2.599 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 14:24:24,503][model2_sft.py][INFO] Epoch:[1/2](44250/63764) loss:2.852 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 
14:24:35,637][model2_sft.py][INFO] Epoch:[1/2](44300/63764) loss:3.662 lr:0.0000010 epoch_Time:73.0min: [2023-12-11 14:24:46,804][model2_sft.py][INFO] Epoch:[1/2](44350/63764) loss:3.581 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 14:24:57,959][model2_sft.py][INFO] Epoch:[1/2](44400/63764) loss:3.377 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 14:25:09,148][model2_sft.py][INFO] Epoch:[1/2](44450/63764) loss:3.474 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 14:25:20,297][model2_sft.py][INFO] Epoch:[1/2](44500/63764) loss:3.756 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 14:25:31,455][model2_sft.py][INFO] Epoch:[1/2](44550/63764) loss:2.977 lr:0.0000010 epoch_Time:72.0min: [2023-12-11 14:25:42,600][model2_sft.py][INFO] Epoch:[1/2](44600/63764) loss:4.043 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 14:25:53,742][model2_sft.py][INFO] Epoch:[1/2](44650/63764) loss:2.893 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 14:26:04,914][model2_sft.py][INFO] Epoch:[1/2](44700/63764) loss:2.978 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 14:26:16,052][model2_sft.py][INFO] Epoch:[1/2](44750/63764) loss:3.621 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 14:26:27,197][model2_sft.py][INFO] Epoch:[1/2](44800/63764) loss:2.931 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 14:26:38,318][model2_sft.py][INFO] Epoch:[1/2](44850/63764) loss:3.295 lr:0.0000010 epoch_Time:71.0min: [2023-12-11 14:26:49,499][model2_sft.py][INFO] Epoch:[1/2](44900/63764) loss:4.155 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 14:27:00,615][model2_sft.py][INFO] Epoch:[1/2](44950/63764) loss:3.885 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 14:27:11,757][model2_sft.py][INFO] Epoch:[1/2](45000/63764) loss:3.667 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 14:27:22,885][model2_sft.py][INFO] Epoch:[1/2](45050/63764) loss:2.806 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 14:27:34,085][model2_sft.py][INFO] Epoch:[1/2](45100/63764) loss:3.727 lr:0.0000010 epoch_Time:70.0min: [2023-12-11 
14:27:45,235][model2_sft.py][INFO] Epoch:[1/2](45150/63764) loss:3.240 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 14:27:56,466][model2_sft.py][INFO] Epoch:[1/2](45200/63764) loss:3.387 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 14:28:07,627][model2_sft.py][INFO] Epoch:[1/2](45250/63764) loss:3.377 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 14:28:18,766][model2_sft.py][INFO] Epoch:[1/2](45300/63764) loss:3.628 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 14:28:29,912][model2_sft.py][INFO] Epoch:[1/2](45350/63764) loss:3.446 lr:0.0000010 epoch_Time:69.0min: [2023-12-11 14:28:41,104][model2_sft.py][INFO] Epoch:[1/2](45400/63764) loss:2.677 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 14:28:52,287][model2_sft.py][INFO] Epoch:[1/2](45450/63764) loss:3.539 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 14:29:03,469][model2_sft.py][INFO] Epoch:[1/2](45500/63764) loss:3.311 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 14:29:14,625][model2_sft.py][INFO] Epoch:[1/2](45550/63764) loss:3.464 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 14:29:25,778][model2_sft.py][INFO] Epoch:[1/2](45600/63764) loss:3.301 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 14:29:37,045][model2_sft.py][INFO] Epoch:[1/2](45650/63764) loss:3.091 lr:0.0000010 epoch_Time:68.0min: [2023-12-11 14:29:48,227][model2_sft.py][INFO] Epoch:[1/2](45700/63764) loss:3.116 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 14:29:59,377][model2_sft.py][INFO] Epoch:[1/2](45750/63764) loss:3.662 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 14:30:10,573][model2_sft.py][INFO] Epoch:[1/2](45800/63764) loss:3.331 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 14:30:21,723][model2_sft.py][INFO] Epoch:[1/2](45850/63764) loss:3.141 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 14:30:32,848][model2_sft.py][INFO] Epoch:[1/2](45900/63764) loss:3.289 lr:0.0000010 epoch_Time:67.0min: [2023-12-11 14:30:44,058][model2_sft.py][INFO] Epoch:[1/2](45950/63764) loss:3.425 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 
14:30:55,229][model2_sft.py][INFO] Epoch:[1/2](46000/63764) loss:3.005 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 14:31:06,428][model2_sft.py][INFO] Epoch:[1/2](46050/63764) loss:3.241 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 14:31:17,655][model2_sft.py][INFO] Epoch:[1/2](46100/63764) loss:3.583 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 14:31:28,814][model2_sft.py][INFO] Epoch:[1/2](46150/63764) loss:3.152 lr:0.0000010 epoch_Time:66.0min: [2023-12-11 14:31:39,975][model2_sft.py][INFO] Epoch:[1/2](46200/63764) loss:3.142 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 14:31:51,153][model2_sft.py][INFO] Epoch:[1/2](46250/63764) loss:3.919 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 14:32:02,301][model2_sft.py][INFO] Epoch:[1/2](46300/63764) loss:3.140 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 14:32:13,520][model2_sft.py][INFO] Epoch:[1/2](46350/63764) loss:3.196 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 14:32:24,716][model2_sft.py][INFO] Epoch:[1/2](46400/63764) loss:3.613 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 14:32:35,856][model2_sft.py][INFO] Epoch:[1/2](46450/63764) loss:2.845 lr:0.0000010 epoch_Time:65.0min: [2023-12-11 14:32:46,986][model2_sft.py][INFO] Epoch:[1/2](46500/63764) loss:2.856 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 14:32:58,154][model2_sft.py][INFO] Epoch:[1/2](46550/63764) loss:2.526 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 14:33:09,312][model2_sft.py][INFO] Epoch:[1/2](46600/63764) loss:3.236 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 14:33:20,469][model2_sft.py][INFO] Epoch:[1/2](46650/63764) loss:3.313 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 14:33:31,661][model2_sft.py][INFO] Epoch:[1/2](46700/63764) loss:3.332 lr:0.0000010 epoch_Time:64.0min: [2023-12-11 14:33:42,794][model2_sft.py][INFO] Epoch:[1/2](46750/63764) loss:3.500 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 14:33:53,970][model2_sft.py][INFO] Epoch:[1/2](46800/63764) loss:2.791 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 
14:34:05,116][model2_sft.py][INFO] Epoch:[1/2](46850/63764) loss:3.052 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 14:34:16,271][model2_sft.py][INFO] Epoch:[1/2](46900/63764) loss:2.968 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 14:34:27,483][model2_sft.py][INFO] Epoch:[1/2](46950/63764) loss:3.220 lr:0.0000010 epoch_Time:63.0min: [2023-12-11 14:34:38,619][model2_sft.py][INFO] Epoch:[1/2](47000/63764) loss:3.324 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 14:34:49,791][model2_sft.py][INFO] Epoch:[1/2](47050/63764) loss:3.074 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 14:35:01,028][model2_sft.py][INFO] Epoch:[1/2](47100/63764) loss:3.765 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 14:35:12,198][model2_sft.py][INFO] Epoch:[1/2](47150/63764) loss:3.486 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 14:35:23,350][model2_sft.py][INFO] Epoch:[1/2](47200/63764) loss:3.497 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 14:35:34,537][model2_sft.py][INFO] Epoch:[1/2](47250/63764) loss:4.121 lr:0.0000010 epoch_Time:62.0min: [2023-12-11 14:35:45,713][model2_sft.py][INFO] Epoch:[1/2](47300/63764) loss:2.971 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 14:35:56,894][model2_sft.py][INFO] Epoch:[1/2](47350/63764) loss:2.837 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 14:36:08,073][model2_sft.py][INFO] Epoch:[1/2](47400/63764) loss:3.274 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 14:36:19,248][model2_sft.py][INFO] Epoch:[1/2](47450/63764) loss:2.795 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 14:36:30,409][model2_sft.py][INFO] Epoch:[1/2](47500/63764) loss:2.954 lr:0.0000010 epoch_Time:61.0min: [2023-12-11 14:36:41,579][model2_sft.py][INFO] Epoch:[1/2](47550/63764) loss:3.031 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 14:36:52,708][model2_sft.py][INFO] Epoch:[1/2](47600/63764) loss:3.860 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 14:37:03,861][model2_sft.py][INFO] Epoch:[1/2](47650/63764) loss:3.372 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 
14:37:15,024][model2_sft.py][INFO] Epoch:[1/2](47700/63764) loss:2.902 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 14:37:26,186][model2_sft.py][INFO] Epoch:[1/2](47750/63764) loss:2.873 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 14:37:37,325][model2_sft.py][INFO] Epoch:[1/2](47800/63764) loss:4.081 lr:0.0000010 epoch_Time:60.0min: [2023-12-11 14:37:48,503][model2_sft.py][INFO] Epoch:[1/2](47850/63764) loss:3.910 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 14:37:59,644][model2_sft.py][INFO] Epoch:[1/2](47900/63764) loss:2.843 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 14:38:10,808][model2_sft.py][INFO] Epoch:[1/2](47950/63764) loss:3.012 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 14:38:21,943][model2_sft.py][INFO] Epoch:[1/2](48000/63764) loss:3.086 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 14:38:33,115][model2_sft.py][INFO] Epoch:[1/2](48050/63764) loss:3.416 lr:0.0000010 epoch_Time:59.0min: [2023-12-11 14:38:44,250][model2_sft.py][INFO] Epoch:[1/2](48100/63764) loss:3.733 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 14:38:55,421][model2_sft.py][INFO] Epoch:[1/2](48150/63764) loss:3.700 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 14:39:06,569][model2_sft.py][INFO] Epoch:[1/2](48200/63764) loss:3.435 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 14:39:17,737][model2_sft.py][INFO] Epoch:[1/2](48250/63764) loss:2.728 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 14:39:28,900][model2_sft.py][INFO] Epoch:[1/2](48300/63764) loss:3.256 lr:0.0000010 epoch_Time:58.0min: [2023-12-11 14:39:40,021][model2_sft.py][INFO] Epoch:[1/2](48350/63764) loss:3.272 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 14:39:51,177][model2_sft.py][INFO] Epoch:[1/2](48400/63764) loss:3.379 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 14:40:02,374][model2_sft.py][INFO] Epoch:[1/2](48450/63764) loss:2.919 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 14:40:13,516][model2_sft.py][INFO] Epoch:[1/2](48500/63764) loss:3.419 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 
14:40:24,655][model2_sft.py][INFO] Epoch:[1/2](48550/63764) loss:3.551 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 14:40:35,862][model2_sft.py][INFO] Epoch:[1/2](48600/63764) loss:3.025 lr:0.0000010 epoch_Time:57.0min: [2023-12-11 14:40:47,071][model2_sft.py][INFO] Epoch:[1/2](48650/63764) loss:3.680 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 14:40:58,213][model2_sft.py][INFO] Epoch:[1/2](48700/63764) loss:3.582 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 14:41:09,368][model2_sft.py][INFO] Epoch:[1/2](48750/63764) loss:3.660 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 14:41:20,497][model2_sft.py][INFO] Epoch:[1/2](48800/63764) loss:3.899 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 14:41:31,677][model2_sft.py][INFO] Epoch:[1/2](48850/63764) loss:4.084 lr:0.0000010 epoch_Time:56.0min: [2023-12-11 14:41:42,868][model2_sft.py][INFO] Epoch:[1/2](48900/63764) loss:3.055 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 14:41:53,970][model2_sft.py][INFO] Epoch:[1/2](48950/63764) loss:3.846 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 14:42:05,101][model2_sft.py][INFO] Epoch:[1/2](49000/63764) loss:3.448 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 14:42:16,225][model2_sft.py][INFO] Epoch:[1/2](49050/63764) loss:3.991 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 14:42:27,396][model2_sft.py][INFO] Epoch:[1/2](49100/63764) loss:3.977 lr:0.0000010 epoch_Time:55.0min: [2023-12-11 14:42:38,558][model2_sft.py][INFO] Epoch:[1/2](49150/63764) loss:3.547 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 14:42:49,696][model2_sft.py][INFO] Epoch:[1/2](49200/63764) loss:3.510 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 14:43:00,851][model2_sft.py][INFO] Epoch:[1/2](49250/63764) loss:3.084 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 14:43:12,008][model2_sft.py][INFO] Epoch:[1/2](49300/63764) loss:2.904 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 14:43:23,123][model2_sft.py][INFO] Epoch:[1/2](49350/63764) loss:3.333 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 
14:43:34,290][model2_sft.py][INFO] Epoch:[1/2](49400/63764) loss:3.520 lr:0.0000010 epoch_Time:54.0min: [2023-12-11 14:43:45,416][model2_sft.py][INFO] Epoch:[1/2](49450/63764) loss:3.713 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 14:43:56,573][model2_sft.py][INFO] Epoch:[1/2](49500/63764) loss:3.276 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 14:44:07,686][model2_sft.py][INFO] Epoch:[1/2](49550/63764) loss:3.077 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 14:44:18,807][model2_sft.py][INFO] Epoch:[1/2](49600/63764) loss:3.132 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 14:44:29,997][model2_sft.py][INFO] Epoch:[1/2](49650/63764) loss:2.554 lr:0.0000010 epoch_Time:53.0min: [2023-12-11 14:44:41,137][model2_sft.py][INFO] Epoch:[1/2](49700/63764) loss:3.398 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 14:44:52,275][model2_sft.py][INFO] Epoch:[1/2](49750/63764) loss:3.330 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 14:45:03,420][model2_sft.py][INFO] Epoch:[1/2](49800/63764) loss:3.565 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 14:45:14,560][model2_sft.py][INFO] Epoch:[1/2](49850/63764) loss:3.898 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 14:45:25,745][model2_sft.py][INFO] Epoch:[1/2](49900/63764) loss:2.796 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 14:45:36,899][model2_sft.py][INFO] Epoch:[1/2](49950/63764) loss:3.314 lr:0.0000010 epoch_Time:52.0min: [2023-12-11 14:45:48,045][model2_sft.py][INFO] Epoch:[1/2](50000/63764) loss:3.420 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 14:45:59,128][model2_sft.py][INFO] Epoch:[1/2](50050/63764) loss:3.038 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 14:46:10,322][model2_sft.py][INFO] Epoch:[1/2](50100/63764) loss:3.793 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 14:46:21,540][model2_sft.py][INFO] Epoch:[1/2](50150/63764) loss:3.555 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 14:46:32,693][model2_sft.py][INFO] Epoch:[1/2](50200/63764) loss:3.572 lr:0.0000010 epoch_Time:51.0min: [2023-12-11 
14:46:43,893][model2_sft.py][INFO] Epoch:[1/2](50250/63764) loss:3.215 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 14:46:55,094][model2_sft.py][INFO] Epoch:[1/2](50300/63764) loss:3.718 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 14:47:06,266][model2_sft.py][INFO] Epoch:[1/2](50350/63764) loss:3.336 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 14:47:17,419][model2_sft.py][INFO] Epoch:[1/2](50400/63764) loss:4.016 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 14:47:28,539][model2_sft.py][INFO] Epoch:[1/2](50450/63764) loss:3.028 lr:0.0000010 epoch_Time:50.0min: [2023-12-11 14:47:39,656][model2_sft.py][INFO] Epoch:[1/2](50500/63764) loss:3.833 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 14:47:50,818][model2_sft.py][INFO] Epoch:[1/2](50550/63764) loss:3.734 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 14:48:01,989][model2_sft.py][INFO] Epoch:[1/2](50600/63764) loss:3.261 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 14:48:13,124][model2_sft.py][INFO] Epoch:[1/2](50650/63764) loss:3.638 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 14:48:24,270][model2_sft.py][INFO] Epoch:[1/2](50700/63764) loss:3.377 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 14:48:35,420][model2_sft.py][INFO] Epoch:[1/2](50750/63764) loss:4.019 lr:0.0000010 epoch_Time:49.0min: [2023-12-11 14:48:46,585][model2_sft.py][INFO] Epoch:[1/2](50800/63764) loss:3.719 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 14:48:57,727][model2_sft.py][INFO] Epoch:[1/2](50850/63764) loss:3.090 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 14:49:08,882][model2_sft.py][INFO] Epoch:[1/2](50900/63764) loss:3.502 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 14:49:20,050][model2_sft.py][INFO] Epoch:[1/2](50950/63764) loss:3.102 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 14:49:31,204][model2_sft.py][INFO] Epoch:[1/2](51000/63764) loss:4.263 lr:0.0000010 epoch_Time:48.0min: [2023-12-11 14:49:42,356][model2_sft.py][INFO] Epoch:[1/2](51050/63764) loss:3.194 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 
14:49:53,522][model2_sft.py][INFO] Epoch:[1/2](51100/63764) loss:3.226 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 14:50:04,618][model2_sft.py][INFO] Epoch:[1/2](51150/63764) loss:2.722 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 14:50:15,784][model2_sft.py][INFO] Epoch:[1/2](51200/63764) loss:3.244 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 14:50:26,912][model2_sft.py][INFO] Epoch:[1/2](51250/63764) loss:3.156 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 14:50:38,089][model2_sft.py][INFO] Epoch:[1/2](51300/63764) loss:3.409 lr:0.0000010 epoch_Time:47.0min: [2023-12-11 14:50:49,274][model2_sft.py][INFO] Epoch:[1/2](51350/63764) loss:3.369 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 14:51:00,381][model2_sft.py][INFO] Epoch:[1/2](51400/63764) loss:3.553 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 14:51:11,615][model2_sft.py][INFO] Epoch:[1/2](51450/63764) loss:2.904 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 14:51:22,758][model2_sft.py][INFO] Epoch:[1/2](51500/63764) loss:3.640 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 14:51:33,854][model2_sft.py][INFO] Epoch:[1/2](51550/63764) loss:2.701 lr:0.0000010 epoch_Time:46.0min: [2023-12-11 14:51:45,026][model2_sft.py][INFO] Epoch:[1/2](51600/63764) loss:3.175 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 14:51:56,161][model2_sft.py][INFO] Epoch:[1/2](51650/63764) loss:3.091 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 14:52:07,321][model2_sft.py][INFO] Epoch:[1/2](51700/63764) loss:3.362 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 14:52:18,453][model2_sft.py][INFO] Epoch:[1/2](51750/63764) loss:4.017 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 14:52:29,629][model2_sft.py][INFO] Epoch:[1/2](51800/63764) loss:3.238 lr:0.0000010 epoch_Time:45.0min: [2023-12-11 14:52:40,786][model2_sft.py][INFO] Epoch:[1/2](51850/63764) loss:3.607 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 14:52:51,914][model2_sft.py][INFO] Epoch:[1/2](51900/63764) loss:3.403 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 
14:53:03,056][model2_sft.py][INFO] Epoch:[1/2](51950/63764) loss:3.403 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 14:53:14,214][model2_sft.py][INFO] Epoch:[1/2](52000/63764) loss:3.315 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 14:53:25,322][model2_sft.py][INFO] Epoch:[1/2](52050/63764) loss:2.301 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 14:53:36,438][model2_sft.py][INFO] Epoch:[1/2](52100/63764) loss:2.946 lr:0.0000010 epoch_Time:44.0min: [2023-12-11 14:53:47,568][model2_sft.py][INFO] Epoch:[1/2](52150/63764) loss:3.226 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 14:53:58,708][model2_sft.py][INFO] Epoch:[1/2](52200/63764) loss:3.146 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 14:54:09,862][model2_sft.py][INFO] Epoch:[1/2](52250/63764) loss:3.630 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 14:54:21,016][model2_sft.py][INFO] Epoch:[1/2](52300/63764) loss:3.030 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 14:54:32,145][model2_sft.py][INFO] Epoch:[1/2](52350/63764) loss:3.097 lr:0.0000010 epoch_Time:43.0min: [2023-12-11 14:54:43,268][model2_sft.py][INFO] Epoch:[1/2](52400/63764) loss:3.058 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 14:54:54,458][model2_sft.py][INFO] Epoch:[1/2](52450/63764) loss:3.488 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 14:55:05,599][model2_sft.py][INFO] Epoch:[1/2](52500/63764) loss:2.523 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 14:55:16,786][model2_sft.py][INFO] Epoch:[1/2](52550/63764) loss:3.700 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 14:55:27,944][model2_sft.py][INFO] Epoch:[1/2](52600/63764) loss:2.897 lr:0.0000010 epoch_Time:42.0min: [2023-12-11 14:55:39,098][model2_sft.py][INFO] Epoch:[1/2](52650/63764) loss:3.266 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 14:55:50,281][model2_sft.py][INFO] Epoch:[1/2](52700/63764) loss:3.378 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 14:56:01,449][model2_sft.py][INFO] Epoch:[1/2](52750/63764) loss:3.145 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 
14:56:12,621][model2_sft.py][INFO] Epoch:[1/2](52800/63764) loss:3.355 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 14:56:23,792][model2_sft.py][INFO] Epoch:[1/2](52850/63764) loss:3.361 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 14:56:34,947][model2_sft.py][INFO] Epoch:[1/2](52900/63764) loss:3.559 lr:0.0000010 epoch_Time:41.0min: [2023-12-11 14:56:46,102][model2_sft.py][INFO] Epoch:[1/2](52950/63764) loss:4.674 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 14:56:57,295][model2_sft.py][INFO] Epoch:[1/2](53000/63764) loss:3.820 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 14:57:08,489][model2_sft.py][INFO] Epoch:[1/2](53050/63764) loss:3.885 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 14:57:19,660][model2_sft.py][INFO] Epoch:[1/2](53100/63764) loss:3.501 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 14:57:30,852][model2_sft.py][INFO] Epoch:[1/2](53150/63764) loss:3.535 lr:0.0000010 epoch_Time:40.0min: [2023-12-11 14:57:42,029][model2_sft.py][INFO] Epoch:[1/2](53200/63764) loss:3.128 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 14:57:53,224][model2_sft.py][INFO] Epoch:[1/2](53250/63764) loss:2.830 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 14:58:04,364][model2_sft.py][INFO] Epoch:[1/2](53300/63764) loss:3.184 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 14:58:15,516][model2_sft.py][INFO] Epoch:[1/2](53350/63764) loss:3.140 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 14:58:26,663][model2_sft.py][INFO] Epoch:[1/2](53400/63764) loss:4.108 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 14:58:37,799][model2_sft.py][INFO] Epoch:[1/2](53450/63764) loss:2.854 lr:0.0000010 epoch_Time:39.0min: [2023-12-11 14:58:48,923][model2_sft.py][INFO] Epoch:[1/2](53500/63764) loss:3.620 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 14:59:00,102][model2_sft.py][INFO] Epoch:[1/2](53550/63764) loss:3.594 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 14:59:11,209][model2_sft.py][INFO] Epoch:[1/2](53600/63764) loss:3.793 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 
14:59:22,362][model2_sft.py][INFO] Epoch:[1/2](53650/63764) loss:3.958 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 14:59:33,498][model2_sft.py][INFO] Epoch:[1/2](53700/63764) loss:3.366 lr:0.0000010 epoch_Time:38.0min: [2023-12-11 14:59:44,662][model2_sft.py][INFO] Epoch:[1/2](53750/63764) loss:4.073 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 14:59:55,800][model2_sft.py][INFO] Epoch:[1/2](53800/63764) loss:3.650 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 15:00:06,941][model2_sft.py][INFO] Epoch:[1/2](53850/63764) loss:3.600 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 15:00:18,060][model2_sft.py][INFO] Epoch:[1/2](53900/63764) loss:3.419 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 15:00:29,223][model2_sft.py][INFO] Epoch:[1/2](53950/63764) loss:3.965 lr:0.0000010 epoch_Time:37.0min: [2023-12-11 15:00:40,357][model2_sft.py][INFO] Epoch:[1/2](54000/63764) loss:3.283 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 15:00:51,566][model2_sft.py][INFO] Epoch:[1/2](54050/63764) loss:3.679 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 15:01:02,710][model2_sft.py][INFO] Epoch:[1/2](54100/63764) loss:3.167 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 15:01:13,871][model2_sft.py][INFO] Epoch:[1/2](54150/63764) loss:3.495 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 15:01:25,040][model2_sft.py][INFO] Epoch:[1/2](54200/63764) loss:2.934 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 15:01:36,193][model2_sft.py][INFO] Epoch:[1/2](54250/63764) loss:3.556 lr:0.0000010 epoch_Time:36.0min: [2023-12-11 15:01:47,364][model2_sft.py][INFO] Epoch:[1/2](54300/63764) loss:3.739 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 15:01:58,514][model2_sft.py][INFO] Epoch:[1/2](54350/63764) loss:2.832 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 15:02:09,693][model2_sft.py][INFO] Epoch:[1/2](54400/63764) loss:3.234 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 15:02:20,844][model2_sft.py][INFO] Epoch:[1/2](54450/63764) loss:3.790 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 
15:02:31,967][model2_sft.py][INFO] Epoch:[1/2](54500/63764) loss:2.994 lr:0.0000010 epoch_Time:35.0min: [2023-12-11 15:02:43,161][model2_sft.py][INFO] Epoch:[1/2](54550/63764) loss:2.924 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 15:02:54,335][model2_sft.py][INFO] Epoch:[1/2](54600/63764) loss:3.325 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 15:03:05,536][model2_sft.py][INFO] Epoch:[1/2](54650/63764) loss:3.116 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 15:03:16,699][model2_sft.py][INFO] Epoch:[1/2](54700/63764) loss:3.564 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 15:03:27,808][model2_sft.py][INFO] Epoch:[1/2](54750/63764) loss:3.073 lr:0.0000010 epoch_Time:34.0min: [2023-12-11 15:03:38,929][model2_sft.py][INFO] Epoch:[1/2](54800/63764) loss:2.913 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 15:03:50,070][model2_sft.py][INFO] Epoch:[1/2](54850/63764) loss:3.624 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 15:04:01,252][model2_sft.py][INFO] Epoch:[1/2](54900/63764) loss:3.863 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 15:04:12,394][model2_sft.py][INFO] Epoch:[1/2](54950/63764) loss:3.533 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 15:04:23,541][model2_sft.py][INFO] Epoch:[1/2](55000/63764) loss:4.002 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 15:04:34,787][model2_sft.py][INFO] Epoch:[1/2](55050/63764) loss:3.301 lr:0.0000010 epoch_Time:33.0min: [2023-12-11 15:04:45,914][model2_sft.py][INFO] Epoch:[1/2](55100/63764) loss:3.233 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 15:04:57,060][model2_sft.py][INFO] Epoch:[1/2](55150/63764) loss:3.409 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 15:05:08,232][model2_sft.py][INFO] Epoch:[1/2](55200/63764) loss:3.542 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 15:05:19,362][model2_sft.py][INFO] Epoch:[1/2](55250/63764) loss:3.943 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 15:05:30,529][model2_sft.py][INFO] Epoch:[1/2](55300/63764) loss:3.337 lr:0.0000010 epoch_Time:32.0min: [2023-12-11 
15:05:41,654][model2_sft.py][INFO] Epoch:[1/2](55350/63764) loss:3.585 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 15:05:52,797][model2_sft.py][INFO] Epoch:[1/2](55400/63764) loss:3.686 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 15:06:03,960][model2_sft.py][INFO] Epoch:[1/2](55450/63764) loss:3.967 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 15:06:15,155][model2_sft.py][INFO] Epoch:[1/2](55500/63764) loss:3.892 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 15:06:26,292][model2_sft.py][INFO] Epoch:[1/2](55550/63764) loss:2.672 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 15:06:37,485][model2_sft.py][INFO] Epoch:[1/2](55600/63764) loss:3.104 lr:0.0000010 epoch_Time:31.0min: [2023-12-11 15:06:48,630][model2_sft.py][INFO] Epoch:[1/2](55650/63764) loss:3.736 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 15:06:59,802][model2_sft.py][INFO] Epoch:[1/2](55700/63764) loss:3.667 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 15:07:10,945][model2_sft.py][INFO] Epoch:[1/2](55750/63764) loss:3.214 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 15:07:22,112][model2_sft.py][INFO] Epoch:[1/2](55800/63764) loss:3.218 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 15:07:33,278][model2_sft.py][INFO] Epoch:[1/2](55850/63764) loss:2.569 lr:0.0000010 epoch_Time:30.0min: [2023-12-11 15:07:44,462][model2_sft.py][INFO] Epoch:[1/2](55900/63764) loss:3.797 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 15:07:55,564][model2_sft.py][INFO] Epoch:[1/2](55950/63764) loss:3.262 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 15:08:06,702][model2_sft.py][INFO] Epoch:[1/2](56000/63764) loss:3.641 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 15:08:17,805][model2_sft.py][INFO] Epoch:[1/2](56050/63764) loss:3.322 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 15:08:28,958][model2_sft.py][INFO] Epoch:[1/2](56100/63764) loss:3.820 lr:0.0000010 epoch_Time:29.0min: [2023-12-11 15:08:40,136][model2_sft.py][INFO] Epoch:[1/2](56150/63764) loss:3.330 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 
15:08:51,297][model2_sft.py][INFO] Epoch:[1/2](56200/63764) loss:3.294 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 15:09:02,414][model2_sft.py][INFO] Epoch:[1/2](56250/63764) loss:3.346 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 15:09:13,520][model2_sft.py][INFO] Epoch:[1/2](56300/63764) loss:2.939 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 15:09:24,648][model2_sft.py][INFO] Epoch:[1/2](56350/63764) loss:4.372 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 15:09:35,822][model2_sft.py][INFO] Epoch:[1/2](56400/63764) loss:3.558 lr:0.0000010 epoch_Time:28.0min: [2023-12-11 15:09:46,938][model2_sft.py][INFO] Epoch:[1/2](56450/63764) loss:3.484 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 15:09:58,087][model2_sft.py][INFO] Epoch:[1/2](56500/63764) loss:3.643 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 15:10:09,277][model2_sft.py][INFO] Epoch:[1/2](56550/63764) loss:3.619 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 15:10:20,477][model2_sft.py][INFO] Epoch:[1/2](56600/63764) loss:3.398 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 15:10:31,612][model2_sft.py][INFO] Epoch:[1/2](56650/63764) loss:3.892 lr:0.0000010 epoch_Time:27.0min: [2023-12-11 15:10:42,790][model2_sft.py][INFO] Epoch:[1/2](56700/63764) loss:3.037 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 15:10:53,946][model2_sft.py][INFO] Epoch:[1/2](56750/63764) loss:3.563 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 15:11:05,072][model2_sft.py][INFO] Epoch:[1/2](56800/63764) loss:3.187 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 15:11:16,213][model2_sft.py][INFO] Epoch:[1/2](56850/63764) loss:3.689 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 15:11:27,349][model2_sft.py][INFO] Epoch:[1/2](56900/63764) loss:3.072 lr:0.0000010 epoch_Time:26.0min: [2023-12-11 15:11:38,557][model2_sft.py][INFO] Epoch:[1/2](56950/63764) loss:3.772 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 15:11:49,768][model2_sft.py][INFO] Epoch:[1/2](57000/63764) loss:3.378 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 
15:12:00,908][model2_sft.py][INFO] Epoch:[1/2](57050/63764) loss:3.776 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 15:12:12,052][model2_sft.py][INFO] Epoch:[1/2](57100/63764) loss:2.874 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 15:12:23,192][model2_sft.py][INFO] Epoch:[1/2](57150/63764) loss:3.652 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 15:12:34,376][model2_sft.py][INFO] Epoch:[1/2](57200/63764) loss:3.141 lr:0.0000010 epoch_Time:25.0min: [2023-12-11 15:12:45,508][model2_sft.py][INFO] Epoch:[1/2](57250/63764) loss:3.878 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 15:12:56,671][model2_sft.py][INFO] Epoch:[1/2](57300/63764) loss:2.825 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 15:13:07,790][model2_sft.py][INFO] Epoch:[1/2](57350/63764) loss:3.702 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 15:13:18,965][model2_sft.py][INFO] Epoch:[1/2](57400/63764) loss:3.165 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 15:13:30,149][model2_sft.py][INFO] Epoch:[1/2](57450/63764) loss:2.825 lr:0.0000010 epoch_Time:24.0min: [2023-12-11 15:13:41,293][model2_sft.py][INFO] Epoch:[1/2](57500/63764) loss:3.385 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 15:13:52,467][model2_sft.py][INFO] Epoch:[1/2](57550/63764) loss:3.331 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 15:14:03,667][model2_sft.py][INFO] Epoch:[1/2](57600/63764) loss:3.564 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 15:14:14,840][model2_sft.py][INFO] Epoch:[1/2](57650/63764) loss:3.436 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 15:14:26,018][model2_sft.py][INFO] Epoch:[1/2](57700/63764) loss:2.754 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 15:14:37,205][model2_sft.py][INFO] Epoch:[1/2](57750/63764) loss:3.266 lr:0.0000010 epoch_Time:23.0min: [2023-12-11 15:14:48,369][model2_sft.py][INFO] Epoch:[1/2](57800/63764) loss:3.910 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 15:14:59,579][model2_sft.py][INFO] Epoch:[1/2](57850/63764) loss:3.241 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 
15:15:10,776][model2_sft.py][INFO] Epoch:[1/2](57900/63764) loss:3.051 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 15:15:21,933][model2_sft.py][INFO] Epoch:[1/2](57950/63764) loss:2.981 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 15:15:33,050][model2_sft.py][INFO] Epoch:[1/2](58000/63764) loss:3.054 lr:0.0000010 epoch_Time:22.0min: [2023-12-11 15:15:44,204][model2_sft.py][INFO] Epoch:[1/2](58050/63764) loss:3.767 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 15:15:55,382][model2_sft.py][INFO] Epoch:[1/2](58100/63764) loss:3.441 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 15:16:06,483][model2_sft.py][INFO] Epoch:[1/2](58150/63764) loss:3.615 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 15:16:17,695][model2_sft.py][INFO] Epoch:[1/2](58200/63764) loss:3.668 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 15:16:28,850][model2_sft.py][INFO] Epoch:[1/2](58250/63764) loss:3.324 lr:0.0000010 epoch_Time:21.0min: [2023-12-11 15:16:40,057][model2_sft.py][INFO] Epoch:[1/2](58300/63764) loss:2.874 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 15:16:51,199][model2_sft.py][INFO] Epoch:[1/2](58350/63764) loss:3.799 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 15:17:02,332][model2_sft.py][INFO] Epoch:[1/2](58400/63764) loss:2.985 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 15:17:13,469][model2_sft.py][INFO] Epoch:[1/2](58450/63764) loss:2.938 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 15:17:24,659][model2_sft.py][INFO] Epoch:[1/2](58500/63764) loss:3.222 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 15:17:35,802][model2_sft.py][INFO] Epoch:[1/2](58550/63764) loss:3.949 lr:0.0000010 epoch_Time:20.0min: [2023-12-11 15:17:46,976][model2_sft.py][INFO] Epoch:[1/2](58600/63764) loss:3.327 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 15:17:58,166][model2_sft.py][INFO] Epoch:[1/2](58650/63764) loss:3.441 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 15:18:09,321][model2_sft.py][INFO] Epoch:[1/2](58700/63764) loss:3.119 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 
15:18:20,484][model2_sft.py][INFO] Epoch:[1/2](58750/63764) loss:3.619 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 15:18:31,658][model2_sft.py][INFO] Epoch:[1/2](58800/63764) loss:3.502 lr:0.0000010 epoch_Time:19.0min: [2023-12-11 15:18:42,833][model2_sft.py][INFO] Epoch:[1/2](58850/63764) loss:3.301 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 15:18:54,000][model2_sft.py][INFO] Epoch:[1/2](58900/63764) loss:3.398 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 15:19:05,151][model2_sft.py][INFO] Epoch:[1/2](58950/63764) loss:3.308 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 15:19:16,321][model2_sft.py][INFO] Epoch:[1/2](59000/63764) loss:3.224 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 15:19:27,507][model2_sft.py][INFO] Epoch:[1/2](59050/63764) loss:2.983 lr:0.0000010 epoch_Time:18.0min: [2023-12-11 15:19:38,652][model2_sft.py][INFO] Epoch:[1/2](59100/63764) loss:3.614 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 15:19:49,839][model2_sft.py][INFO] Epoch:[1/2](59150/63764) loss:3.338 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 15:20:00,974][model2_sft.py][INFO] Epoch:[1/2](59200/63764) loss:3.682 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 15:20:12,134][model2_sft.py][INFO] Epoch:[1/2](59250/63764) loss:2.901 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 15:20:23,279][model2_sft.py][INFO] Epoch:[1/2](59300/63764) loss:2.735 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 15:20:34,451][model2_sft.py][INFO] Epoch:[1/2](59350/63764) loss:3.492 lr:0.0000010 epoch_Time:17.0min: [2023-12-11 15:20:45,613][model2_sft.py][INFO] Epoch:[1/2](59400/63764) loss:2.814 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 15:20:56,773][model2_sft.py][INFO] Epoch:[1/2](59450/63764) loss:2.855 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 15:21:07,943][model2_sft.py][INFO] Epoch:[1/2](59500/63764) loss:3.354 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 15:21:19,093][model2_sft.py][INFO] Epoch:[1/2](59550/63764) loss:3.438 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 
15:21:30,279][model2_sft.py][INFO] Epoch:[1/2](59600/63764) loss:3.654 lr:0.0000010 epoch_Time:16.0min: [2023-12-11 15:21:41,409][model2_sft.py][INFO] Epoch:[1/2](59650/63764) loss:3.059 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 15:21:52,561][model2_sft.py][INFO] Epoch:[1/2](59700/63764) loss:3.959 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 15:22:03,710][model2_sft.py][INFO] Epoch:[1/2](59750/63764) loss:3.094 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 15:22:14,857][model2_sft.py][INFO] Epoch:[1/2](59800/63764) loss:2.455 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 15:22:26,038][model2_sft.py][INFO] Epoch:[1/2](59850/63764) loss:3.679 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 15:22:37,166][model2_sft.py][INFO] Epoch:[1/2](59900/63764) loss:2.971 lr:0.0000010 epoch_Time:15.0min: [2023-12-11 15:22:48,292][model2_sft.py][INFO] Epoch:[1/2](59950/63764) loss:3.652 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 15:22:59,459][model2_sft.py][INFO] Epoch:[1/2](60000/63764) loss:3.171 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 15:23:10,619][model2_sft.py][INFO] Epoch:[1/2](60050/63764) loss:3.743 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 15:23:21,737][model2_sft.py][INFO] Epoch:[1/2](60100/63764) loss:3.288 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 15:23:32,872][model2_sft.py][INFO] Epoch:[1/2](60150/63764) loss:3.751 lr:0.0000010 epoch_Time:14.0min: [2023-12-11 15:23:43,977][model2_sft.py][INFO] Epoch:[1/2](60200/63764) loss:3.465 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 15:23:55,130][model2_sft.py][INFO] Epoch:[1/2](60250/63764) loss:3.403 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 15:24:06,262][model2_sft.py][INFO] Epoch:[1/2](60300/63764) loss:3.296 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 15:24:17,367][model2_sft.py][INFO] Epoch:[1/2](60350/63764) loss:3.559 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 15:24:28,546][model2_sft.py][INFO] Epoch:[1/2](60400/63764) loss:3.205 lr:0.0000010 epoch_Time:13.0min: [2023-12-11 
15:24:39,681][model2_sft.py][INFO] Epoch:[1/2](60450/63764) loss:3.507 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 15:24:50,845][model2_sft.py][INFO] Epoch:[1/2](60500/63764) loss:3.796 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 15:25:01,965][model2_sft.py][INFO] Epoch:[1/2](60550/63764) loss:2.865 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 15:25:13,134][model2_sft.py][INFO] Epoch:[1/2](60600/63764) loss:3.811 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 15:25:24,313][model2_sft.py][INFO] Epoch:[1/2](60650/63764) loss:3.122 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 15:25:35,533][model2_sft.py][INFO] Epoch:[1/2](60700/63764) loss:3.240 lr:0.0000010 epoch_Time:12.0min: [2023-12-11 15:25:46,684][model2_sft.py][INFO] Epoch:[1/2](60750/63764) loss:3.168 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 15:25:57,831][model2_sft.py][INFO] Epoch:[1/2](60800/63764) loss:3.143 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 15:26:09,037][model2_sft.py][INFO] Epoch:[1/2](60850/63764) loss:3.771 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 15:26:20,175][model2_sft.py][INFO] Epoch:[1/2](60900/63764) loss:3.320 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 15:26:31,348][model2_sft.py][INFO] Epoch:[1/2](60950/63764) loss:2.926 lr:0.0000010 epoch_Time:11.0min: [2023-12-11 15:26:42,516][model2_sft.py][INFO] Epoch:[1/2](61000/63764) loss:3.176 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 15:26:53,653][model2_sft.py][INFO] Epoch:[1/2](61050/63764) loss:3.399 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 15:27:04,790][model2_sft.py][INFO] Epoch:[1/2](61100/63764) loss:4.048 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 15:27:15,887][model2_sft.py][INFO] Epoch:[1/2](61150/63764) loss:3.166 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 15:27:27,050][model2_sft.py][INFO] Epoch:[1/2](61200/63764) loss:3.916 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 15:27:38,202][model2_sft.py][INFO] Epoch:[1/2](61250/63764) loss:2.959 lr:0.0000010 epoch_Time:10.0min: [2023-12-11 
15:27:49,335][model2_sft.py][INFO] Epoch:[1/2](61300/63764) loss:3.619 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 15:28:00,497][model2_sft.py][INFO] Epoch:[1/2](61350/63764) loss:4.122 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 15:28:11,656][model2_sft.py][INFO] Epoch:[1/2](61400/63764) loss:4.311 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 15:28:22,793][model2_sft.py][INFO] Epoch:[1/2](61450/63764) loss:3.612 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 15:28:33,947][model2_sft.py][INFO] Epoch:[1/2](61500/63764) loss:3.267 lr:0.0000010 epoch_Time:9.0min: [2023-12-11 15:28:45,098][model2_sft.py][INFO] Epoch:[1/2](61550/63764) loss:3.254 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 15:28:56,234][model2_sft.py][INFO] Epoch:[1/2](61600/63764) loss:3.532 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 15:29:07,400][model2_sft.py][INFO] Epoch:[1/2](61650/63764) loss:2.819 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 15:29:18,563][model2_sft.py][INFO] Epoch:[1/2](61700/63764) loss:2.720 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 15:29:29,748][model2_sft.py][INFO] Epoch:[1/2](61750/63764) loss:3.425 lr:0.0000010 epoch_Time:8.0min: [2023-12-11 15:29:40,899][model2_sft.py][INFO] Epoch:[1/2](61800/63764) loss:3.343 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 15:29:52,036][model2_sft.py][INFO] Epoch:[1/2](61850/63764) loss:3.259 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 15:30:03,246][model2_sft.py][INFO] Epoch:[1/2](61900/63764) loss:3.260 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 15:30:14,427][model2_sft.py][INFO] Epoch:[1/2](61950/63764) loss:3.554 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 15:30:25,566][model2_sft.py][INFO] Epoch:[1/2](62000/63764) loss:2.870 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 15:30:36,721][model2_sft.py][INFO] Epoch:[1/2](62050/63764) loss:3.729 lr:0.0000010 epoch_Time:7.0min: [2023-12-11 15:30:47,893][model2_sft.py][INFO] Epoch:[1/2](62100/63764) loss:3.150 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 15:30:59,023][model2_sft.py][INFO] 
Epoch:[1/2](62150/63764) loss:2.976 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 15:31:10,178][model2_sft.py][INFO] Epoch:[1/2](62200/63764) loss:3.226 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 15:31:21,411][model2_sft.py][INFO] Epoch:[1/2](62250/63764) loss:3.567 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 15:31:32,533][model2_sft.py][INFO] Epoch:[1/2](62300/63764) loss:3.119 lr:0.0000010 epoch_Time:6.0min: [2023-12-11 15:31:43,725][model2_sft.py][INFO] Epoch:[1/2](62350/63764) loss:2.395 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 15:31:54,891][model2_sft.py][INFO] Epoch:[1/2](62400/63764) loss:3.500 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 15:32:06,067][model2_sft.py][INFO] Epoch:[1/2](62450/63764) loss:3.620 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 15:32:17,283][model2_sft.py][INFO] Epoch:[1/2](62500/63764) loss:2.906 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 15:32:28,416][model2_sft.py][INFO] Epoch:[1/2](62550/63764) loss:3.040 lr:0.0000010 epoch_Time:5.0min: [2023-12-11 15:32:39,558][model2_sft.py][INFO] Epoch:[1/2](62600/63764) loss:3.868 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 15:32:50,734][model2_sft.py][INFO] Epoch:[1/2](62650/63764) loss:2.896 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 15:33:01,857][model2_sft.py][INFO] Epoch:[1/2](62700/63764) loss:3.540 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 15:33:12,991][model2_sft.py][INFO] Epoch:[1/2](62750/63764) loss:3.640 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 15:33:24,105][model2_sft.py][INFO] Epoch:[1/2](62800/63764) loss:3.831 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 15:33:35,264][model2_sft.py][INFO] Epoch:[1/2](62850/63764) loss:3.989 lr:0.0000010 epoch_Time:4.0min: [2023-12-11 15:33:46,406][model2_sft.py][INFO] Epoch:[1/2](62900/63764) loss:3.713 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 15:33:57,549][model2_sft.py][INFO] Epoch:[1/2](62950/63764) loss:3.411 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 15:34:08,683][model2_sft.py][INFO] Epoch:[1/2](63000/63764) loss:4.028 
lr:0.0000010 epoch_Time:3.0min: [2023-12-11 15:34:19,864][model2_sft.py][INFO] Epoch:[1/2](63050/63764) loss:2.570 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 15:34:30,988][model2_sft.py][INFO] Epoch:[1/2](63100/63764) loss:3.277 lr:0.0000010 epoch_Time:3.0min: [2023-12-11 15:34:42,148][model2_sft.py][INFO] Epoch:[1/2](63150/63764) loss:3.139 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 15:34:53,302][model2_sft.py][INFO] Epoch:[1/2](63200/63764) loss:2.802 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 15:35:04,466][model2_sft.py][INFO] Epoch:[1/2](63250/63764) loss:3.375 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 15:35:15,607][model2_sft.py][INFO] Epoch:[1/2](63300/63764) loss:3.283 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 15:35:26,800][model2_sft.py][INFO] Epoch:[1/2](63350/63764) loss:3.444 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 15:35:37,945][model2_sft.py][INFO] Epoch:[1/2](63400/63764) loss:3.555 lr:0.0000010 epoch_Time:2.0min: [2023-12-11 15:35:49,089][model2_sft.py][INFO] Epoch:[1/2](63450/63764) loss:3.016 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 15:36:00,234][model2_sft.py][INFO] Epoch:[1/2](63500/63764) loss:3.011 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 15:36:11,342][model2_sft.py][INFO] Epoch:[1/2](63550/63764) loss:3.046 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 15:36:22,466][model2_sft.py][INFO] Epoch:[1/2](63600/63764) loss:3.168 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 15:36:33,630][model2_sft.py][INFO] Epoch:[1/2](63650/63764) loss:2.999 lr:0.0000010 epoch_Time:1.0min: [2023-12-11 15:36:44,798][model2_sft.py][INFO] Epoch:[1/2](63700/63764) loss:3.688 lr:0.0000010 epoch_Time:0.0min: [2023-12-11 15:36:56,001][model2_sft.py][INFO] Epoch:[1/2](63750/63764) loss:3.470 lr:0.0000010 epoch_Time:0.0min: