Hiranmai49 committed on
Commit b83accd · verified · 1 Parent(s): e9d62c0

End of training

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
- - Loss: 2.8722
+ - Loss: 3.7248
 
 ## Model description
 
@@ -41,62 +41,212 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
- - num_epochs: 50
+ - num_epochs: 200
 
 ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | No log | 1.0 | 140 | 3.2358 |
- | No log | 2.0 | 280 | 3.1312 |
- | No log | 3.0 | 420 | 3.0747 |
- | 3.229 | 4.0 | 560 | 3.0344 |
- | 3.229 | 5.0 | 700 | 3.0087 |
- | 3.229 | 6.0 | 840 | 2.9826 |
- | 3.229 | 7.0 | 980 | 2.9612 |
- | 2.8994 | 8.0 | 1120 | 2.9485 |
- | 2.8994 | 9.0 | 1260 | 2.9314 |
- | 2.8994 | 10.0 | 1400 | 2.9206 |
- | 2.7362 | 11.0 | 1540 | 2.9058 |
- | 2.7362 | 12.0 | 1680 | 2.8936 |
- | 2.7362 | 13.0 | 1820 | 2.8910 |
- | 2.7362 | 14.0 | 1960 | 2.8837 |
- | 2.6111 | 15.0 | 2100 | 2.8820 |
- | 2.6111 | 16.0 | 2240 | 2.8754 |
- | 2.6111 | 17.0 | 2380 | 2.8715 |
- | 2.5132 | 18.0 | 2520 | 2.8670 |
- | 2.5132 | 19.0 | 2660 | 2.8645 |
- | 2.5132 | 20.0 | 2800 | 2.8602 |
- | 2.5132 | 21.0 | 2940 | 2.8605 |
- | 2.4321 | 22.0 | 3080 | 2.8604 |
- | 2.4321 | 23.0 | 3220 | 2.8552 |
- | 2.4321 | 24.0 | 3360 | 2.8564 |
- | 2.3645 | 25.0 | 3500 | 2.8564 |
- | 2.3645 | 26.0 | 3640 | 2.8613 |
- | 2.3645 | 27.0 | 3780 | 2.8560 |
- | 2.3645 | 28.0 | 3920 | 2.8510 |
- | 2.3077 | 29.0 | 4060 | 2.8535 |
- | 2.3077 | 30.0 | 4200 | 2.8528 |
- | 2.3077 | 31.0 | 4340 | 2.8585 |
- | 2.3077 | 32.0 | 4480 | 2.8610 |
- | 2.2607 | 33.0 | 4620 | 2.8625 |
- | 2.2607 | 34.0 | 4760 | 2.8602 |
- | 2.2607 | 35.0 | 4900 | 2.8643 |
- | 2.2233 | 36.0 | 5040 | 2.8591 |
- | 2.2233 | 37.0 | 5180 | 2.8647 |
- | 2.2233 | 38.0 | 5320 | 2.8638 |
- | 2.2233 | 39.0 | 5460 | 2.8657 |
- | 2.193 | 40.0 | 5600 | 2.8644 |
- | 2.193 | 41.0 | 5740 | 2.8620 |
- | 2.193 | 42.0 | 5880 | 2.8676 |
- | 2.1706 | 43.0 | 6020 | 2.8702 |
- | 2.1706 | 44.0 | 6160 | 2.8704 |
- | 2.1706 | 45.0 | 6300 | 2.8698 |
- | 2.1706 | 46.0 | 6440 | 2.8716 |
- | 2.155 | 47.0 | 6580 | 2.8714 |
- | 2.155 | 48.0 | 6720 | 2.8726 |
- | 2.155 | 49.0 | 6860 | 2.8718 |
- | 2.1472 | 50.0 | 7000 | 2.8722 |
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:-----:|:---------------:|
+ | No log | 1.0 | 140 | 3.2347 |
+ | No log | 2.0 | 280 | 3.1292 |
+ | No log | 3.0 | 420 | 3.0718 |
+ | 3.2263 | 4.0 | 560 | 3.0302 |
+ | 3.2263 | 5.0 | 700 | 3.0034 |
+ | 3.2263 | 6.0 | 840 | 2.9764 |
+ | 3.2263 | 7.0 | 980 | 2.9535 |
+ | 2.8888 | 8.0 | 1120 | 2.9404 |
+ | 2.8888 | 9.0 | 1260 | 2.9223 |
+ | 2.8888 | 10.0 | 1400 | 2.9102 |
+ | 2.7132 | 11.0 | 1540 | 2.8945 |
+ | 2.7132 | 12.0 | 1680 | 2.8817 |
+ | 2.7132 | 13.0 | 1820 | 2.8795 |
+ | 2.7132 | 14.0 | 1960 | 2.8709 |
+ | 2.5718 | 15.0 | 2100 | 2.8711 |
+ | 2.5718 | 16.0 | 2240 | 2.8623 |
+ | 2.5718 | 17.0 | 2380 | 2.8590 |
+ | 2.454 | 18.0 | 2520 | 2.8583 |
+ | 2.454 | 19.0 | 2660 | 2.8538 |
+ | 2.454 | 20.0 | 2800 | 2.8498 |
+ | 2.454 | 21.0 | 2940 | 2.8482 |
+ | 2.3493 | 22.0 | 3080 | 2.8540 |
+ | 2.3493 | 23.0 | 3220 | 2.8540 |
+ | 2.3493 | 24.0 | 3360 | 2.8497 |
+ | 2.2555 | 25.0 | 3500 | 2.8531 |
+ | 2.2555 | 26.0 | 3640 | 2.8568 |
+ | 2.2555 | 27.0 | 3780 | 2.8589 |
+ | 2.2555 | 28.0 | 3920 | 2.8549 |
+ | 2.1675 | 29.0 | 4060 | 2.8618 |
+ | 2.1675 | 30.0 | 4200 | 2.8584 |
+ | 2.1675 | 31.0 | 4340 | 2.8713 |
+ | 2.1675 | 32.0 | 4480 | 2.8801 |
+ | 2.0887 | 33.0 | 4620 | 2.8836 |
+ | 2.0887 | 34.0 | 4760 | 2.8846 |
+ | 2.0887 | 35.0 | 4900 | 2.8889 |
+ | 2.0144 | 36.0 | 5040 | 2.8925 |
+ | 2.0144 | 37.0 | 5180 | 2.9043 |
+ | 2.0144 | 38.0 | 5320 | 2.9181 |
+ | 2.0144 | 39.0 | 5460 | 2.9156 |
+ | 1.9458 | 40.0 | 5600 | 2.9211 |
+ | 1.9458 | 41.0 | 5740 | 2.9174 |
+ | 1.9458 | 42.0 | 5880 | 2.9329 |
+ | 1.8813 | 43.0 | 6020 | 2.9373 |
+ | 1.8813 | 44.0 | 6160 | 2.9565 |
+ | 1.8813 | 45.0 | 6300 | 2.9679 |
+ | 1.8813 | 46.0 | 6440 | 2.9644 |
+ | 1.8202 | 47.0 | 6580 | 2.9661 |
+ | 1.8202 | 48.0 | 6720 | 2.9898 |
+ | 1.8202 | 49.0 | 6860 | 2.9825 |
+ | 1.7662 | 50.0 | 7000 | 3.0069 |
+ | 1.7662 | 51.0 | 7140 | 2.9936 |
+ | 1.7662 | 52.0 | 7280 | 3.0068 |
+ | 1.7662 | 53.0 | 7420 | 3.0241 |
+ | 1.7081 | 54.0 | 7560 | 3.0146 |
+ | 1.7081 | 55.0 | 7700 | 3.0353 |
+ | 1.7081 | 56.0 | 7840 | 3.0397 |
+ | 1.7081 | 57.0 | 7980 | 3.0516 |
+ | 1.6603 | 58.0 | 8120 | 3.0395 |
+ | 1.6603 | 59.0 | 8260 | 3.0648 |
+ | 1.6603 | 60.0 | 8400 | 3.0601 |
+ | 1.6118 | 61.0 | 8540 | 3.0792 |
+ | 1.6118 | 62.0 | 8680 | 3.0807 |
+ | 1.6118 | 63.0 | 8820 | 3.0881 |
+ | 1.6118 | 64.0 | 8960 | 3.0998 |
+ | 1.5671 | 65.0 | 9100 | 3.1146 |
+ | 1.5671 | 66.0 | 9240 | 3.1199 |
+ | 1.5671 | 67.0 | 9380 | 3.1376 |
+ | 1.5259 | 68.0 | 9520 | 3.1391 |
+ | 1.5259 | 69.0 | 9660 | 3.1383 |
+ | 1.5259 | 70.0 | 9800 | 3.1566 |
+ | 1.5259 | 71.0 | 9940 | 3.1595 |
+ | 1.4799 | 72.0 | 10080 | 3.1620 |
+ | 1.4799 | 73.0 | 10220 | 3.1931 |
+ | 1.4799 | 74.0 | 10360 | 3.1830 |
+ | 1.4444 | 75.0 | 10500 | 3.2015 |
+ | 1.4444 | 76.0 | 10640 | 3.2013 |
+ | 1.4444 | 77.0 | 10780 | 3.2113 |
+ | 1.4444 | 78.0 | 10920 | 3.2097 |
+ | 1.4056 | 79.0 | 11060 | 3.2505 |
+ | 1.4056 | 80.0 | 11200 | 3.2375 |
+ | 1.4056 | 81.0 | 11340 | 3.2439 |
+ | 1.4056 | 82.0 | 11480 | 3.2540 |
+ | 1.3708 | 83.0 | 11620 | 3.2550 |
+ | 1.3708 | 84.0 | 11760 | 3.2658 |
+ | 1.3708 | 85.0 | 11900 | 3.2830 |
+ | 1.3346 | 86.0 | 12040 | 3.2945 |
+ | 1.3346 | 87.0 | 12180 | 3.2818 |
+ | 1.3346 | 88.0 | 12320 | 3.3014 |
+ | 1.3346 | 89.0 | 12460 | 3.3156 |
+ | 1.3067 | 90.0 | 12600 | 3.3044 |
+ | 1.3067 | 91.0 | 12740 | 3.3089 |
+ | 1.3067 | 92.0 | 12880 | 3.3307 |
+ | 1.2757 | 93.0 | 13020 | 3.3314 |
+ | 1.2757 | 94.0 | 13160 | 3.3311 |
+ | 1.2757 | 95.0 | 13300 | 3.3384 |
+ | 1.2757 | 96.0 | 13440 | 3.3591 |
+ | 1.2473 | 97.0 | 13580 | 3.3783 |
+ | 1.2473 | 98.0 | 13720 | 3.3683 |
+ | 1.2473 | 99.0 | 13860 | 3.3692 |
+ | 1.2195 | 100.0 | 14000 | 3.3822 |
+ | 1.2195 | 101.0 | 14140 | 3.3860 |
+ | 1.2195 | 102.0 | 14280 | 3.3996 |
+ | 1.2195 | 103.0 | 14420 | 3.4089 |
+ | 1.1932 | 104.0 | 14560 | 3.4005 |
+ | 1.1932 | 105.0 | 14700 | 3.4273 |
+ | 1.1932 | 106.0 | 14840 | 3.4225 |
+ | 1.1932 | 107.0 | 14980 | 3.4323 |
+ | 1.17 | 108.0 | 15120 | 3.4463 |
+ | 1.17 | 109.0 | 15260 | 3.4339 |
+ | 1.17 | 110.0 | 15400 | 3.4574 |
+ | 1.1454 | 111.0 | 15540 | 3.4486 |
+ | 1.1454 | 112.0 | 15680 | 3.4675 |
+ | 1.1454 | 113.0 | 15820 | 3.4642 |
+ | 1.1454 | 114.0 | 15960 | 3.4600 |
+ | 1.1233 | 115.0 | 16100 | 3.4770 |
+ | 1.1233 | 116.0 | 16240 | 3.4924 |
+ | 1.1233 | 117.0 | 16380 | 3.5105 |
+ | 1.1039 | 118.0 | 16520 | 3.4984 |
+ | 1.1039 | 119.0 | 16660 | 3.4944 |
+ | 1.1039 | 120.0 | 16800 | 3.5021 |
+ | 1.1039 | 121.0 | 16940 | 3.5207 |
+ | 1.0811 | 122.0 | 17080 | 3.5109 |
+ | 1.0811 | 123.0 | 17220 | 3.5304 |
+ | 1.0811 | 124.0 | 17360 | 3.5331 |
+ | 1.0667 | 125.0 | 17500 | 3.5376 |
+ | 1.0667 | 126.0 | 17640 | 3.5428 |
+ | 1.0667 | 127.0 | 17780 | 3.5428 |
+ | 1.0667 | 128.0 | 17920 | 3.5569 |
+ | 1.0455 | 129.0 | 18060 | 3.5547 |
+ | 1.0455 | 130.0 | 18200 | 3.5542 |
+ | 1.0455 | 131.0 | 18340 | 3.5582 |
+ | 1.0455 | 132.0 | 18480 | 3.5717 |
+ | 1.0291 | 133.0 | 18620 | 3.5706 |
+ | 1.0291 | 134.0 | 18760 | 3.5743 |
+ | 1.0291 | 135.0 | 18900 | 3.5715 |
+ | 1.0146 | 136.0 | 19040 | 3.5843 |
+ | 1.0146 | 137.0 | 19180 | 3.5934 |
+ | 1.0146 | 138.0 | 19320 | 3.5871 |
+ | 1.0146 | 139.0 | 19460 | 3.6023 |
+ | 1.0003 | 140.0 | 19600 | 3.5991 |
+ | 1.0003 | 141.0 | 19740 | 3.6028 |
+ | 1.0003 | 142.0 | 19880 | 3.6090 |
+ | 0.9884 | 143.0 | 20020 | 3.6132 |
+ | 0.9884 | 144.0 | 20160 | 3.6175 |
+ | 0.9884 | 145.0 | 20300 | 3.6128 |
+ | 0.9884 | 146.0 | 20440 | 3.6239 |
+ | 0.9739 | 147.0 | 20580 | 3.6257 |
+ | 0.9739 | 148.0 | 20720 | 3.6397 |
+ | 0.9739 | 149.0 | 20860 | 3.6338 |
+ | 0.9627 | 150.0 | 21000 | 3.6318 |
+ | 0.9627 | 151.0 | 21140 | 3.6423 |
+ | 0.9627 | 152.0 | 21280 | 3.6473 |
+ | 0.9627 | 153.0 | 21420 | 3.6600 |
+ | 0.9506 | 154.0 | 21560 | 3.6536 |
+ | 0.9506 | 155.0 | 21700 | 3.6573 |
+ | 0.9506 | 156.0 | 21840 | 3.6611 |
+ | 0.9506 | 157.0 | 21980 | 3.6566 |
+ | 0.9396 | 158.0 | 22120 | 3.6641 |
+ | 0.9396 | 159.0 | 22260 | 3.6612 |
+ | 0.9396 | 160.0 | 22400 | 3.6659 |
+ | 0.9299 | 161.0 | 22540 | 3.6762 |
+ | 0.9299 | 162.0 | 22680 | 3.6763 |
+ | 0.9299 | 163.0 | 22820 | 3.6769 |
+ | 0.9299 | 164.0 | 22960 | 3.6785 |
+ | 0.9216 | 165.0 | 23100 | 3.6863 |
+ | 0.9216 | 166.0 | 23240 | 3.6955 |
+ | 0.9216 | 167.0 | 23380 | 3.6925 |
+ | 0.9163 | 168.0 | 23520 | 3.6870 |
+ | 0.9163 | 169.0 | 23660 | 3.6948 |
+ | 0.9163 | 170.0 | 23800 | 3.7016 |
+ | 0.9163 | 171.0 | 23940 | 3.6943 |
+ | 0.9064 | 172.0 | 24080 | 3.7027 |
+ | 0.9064 | 173.0 | 24220 | 3.6991 |
+ | 0.9064 | 174.0 | 24360 | 3.6993 |
+ | 0.902 | 175.0 | 24500 | 3.7033 |
+ | 0.902 | 176.0 | 24640 | 3.7069 |
+ | 0.902 | 177.0 | 24780 | 3.7101 |
+ | 0.902 | 178.0 | 24920 | 3.7101 |
+ | 0.8958 | 179.0 | 25060 | 3.7083 |
+ | 0.8958 | 180.0 | 25200 | 3.7167 |
+ | 0.8958 | 181.0 | 25340 | 3.7114 |
+ | 0.8958 | 182.0 | 25480 | 3.7122 |
+ | 0.8928 | 183.0 | 25620 | 3.7084 |
+ | 0.8928 | 184.0 | 25760 | 3.7175 |
+ | 0.8928 | 185.0 | 25900 | 3.7166 |
+ | 0.8884 | 186.0 | 26040 | 3.7169 |
+ | 0.8884 | 187.0 | 26180 | 3.7158 |
+ | 0.8884 | 188.0 | 26320 | 3.7199 |
+ | 0.8884 | 189.0 | 26460 | 3.7207 |
+ | 0.884 | 190.0 | 26600 | 3.7203 |
+ | 0.884 | 191.0 | 26740 | 3.7192 |
+ | 0.884 | 192.0 | 26880 | 3.7205 |
+ | 0.8807 | 193.0 | 27020 | 3.7223 |
+ | 0.8807 | 194.0 | 27160 | 3.7206 |
+ | 0.8807 | 195.0 | 27300 | 3.7265 |
+ | 0.8807 | 196.0 | 27440 | 3.7246 |
+ | 0.8781 | 197.0 | 27580 | 3.7252 |
+ | 0.8781 | 198.0 | 27720 | 3.7243 |
+ | 0.8781 | 199.0 | 27860 | 3.7247 |
+ | 0.8781 | 200.0 | 28000 | 3.7248 |
 
 
 ### Framework versions
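
A note on what this README diff shows: extending training from 50 to 200 epochs pushes the final training loss down from 2.1472 to 0.8781, but the validation loss climbs from its best of roughly 2.85 (reached around epochs 20–28 in both runs) to 3.7248, i.e. the longer run overfits. Assuming the reported loss is the usual mean token-level cross-entropy logged by the Hugging Face Trainer, perplexity is simply exp(loss); a minimal sketch of the comparison:

```python
import math

# Final evaluation losses reported in the two versions of this model card.
loss_50_epochs = 2.8722   # previous run: 50 epochs
loss_200_epochs = 3.7248  # this commit: 200 epochs

# For a causal LM trained with token-level cross-entropy, perplexity = exp(loss).
print(f"50-epoch run:  perplexity ~ {math.exp(loss_50_epochs):.1f}")   # ~17.7
print(f"200-epoch run: perplexity ~ {math.exp(loss_200_epochs):.1f}")  # ~41.5
```

Seen as perplexity, the regression is concrete: the extra 150 epochs more than double the per-token perplexity on the evaluation set.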
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:02bfb307622d7779f9f7b1ad2c93d087718984ed09d18541400c767314e53cd5
+ oid sha256:1d77a66c05a9a46fc7201bd251c2013b8ed4b44588a9901a7d3cd774b1fccb65
 size 327657928
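
The model.safetensors entry above is a Git LFS pointer, not the weights themselves: the repository tracks only a sha256 oid and the byte size (327657928 stays the same, only the oid changes because the tensor values changed). A minimal sketch of checking a locally downloaded copy against the new pointer; the local filename is an assumption about where the file was saved:

```python
import hashlib

def lfs_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file and return its sha256 hex digest (the LFS 'oid')."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Expected oid from the new LFS pointer in this commit.
expected = "1d77a66c05a9a46fc7201bd251c2013b8ed4b44588a9901a7d3cd774b1fccb65"
# "model.safetensors" stands for a locally downloaded copy of the weights.
assert lfs_sha256("model.safetensors") == expected
```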
runs/Dec11_14-52-33_ltrcgpu2/events.out.tfevents.1733908956.ltrcgpu2.533660.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:59c753f0da4a010fba69a63813ea234c89243cdd0fb86571ccda14a15f54d6eb
- size 70609
+ oid sha256:f5d09bdbb9efd6e05c0c0eebd98a41c48517affc71f811fae9f8024fcfcfcbd5
+ size 72288
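
The events.out.tfevents files updated here are TensorBoard logs written during training, and they are where the per-epoch curves in the README table come from. A hedged sketch of reading the scalars back with TensorBoard's EventAccumulator; the tag name "eval/loss" is what the Trainer usually emits and is an assumption here:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Locally downloaded copy of the event file referenced in this commit.
ea = EventAccumulator(
    "runs/Dec11_14-52-33_ltrcgpu2/events.out.tfevents.1733908956.ltrcgpu2.533660.0"
)
ea.Reload()

print(ea.Tags()["scalars"])            # list the scalar tags actually present
for event in ea.Scalars("eval/loss"):  # assumed tag name for the validation loss curve
    print(event.step, event.value)
```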
runs/Dec11_14-52-33_ltrcgpu2/events.out.tfevents.1733913582.ltrcgpu2.533660.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2866eb28a53c688940173edf45efcac94e11b4c9af4048f7145e520cbdaa49ea
+ size 364
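
Since the base model is distilgpt2, the updated checkpoint loads through the standard causal-LM classes. The repository id below is a placeholder (this page does not show the actual repo name), so substitute the real id or a local path; a minimal usage sketch:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id / local path -- substitute the repository this commit belongs to.
checkpoint = "Hiranmai49/placeholder-distilgpt2-finetune"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # loads model.safetensors

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```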