---
library_name: peft
license: gemma
base_model: google/gemma-3-1b-it
tags:
- llama-factory
- lora
- generated_from_trainer
datasets:
- super_glue
model-index:
- name: train_cb_1745950310
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# train_cb_1745950310

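This model is a fine-tuned version of [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it) on the cb task of the super_glue dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2460 (the lowest validation loss in the training results table, reached at step 600)
- Num Input Tokens Seen: 22718312 (total at the final training step, 40000)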

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 123
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- training_steps: 40000
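
The batch-size settings and the cosine schedule listed above can be related with a little arithmetic. The following is a minimal sketch, assuming a single device and no warmup (neither is stated in this card); the function names are hypothetical and are not part of Transformers or LLaMA-Factory.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 2
TRAINING_STEPS = 40000


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Examples contributing to one optimizer step.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_device * accumulation * num_devices


def cosine_lr(step: int, base_lr: float = LEARNING_RATE,
              total_steps: int = TRAINING_STEPS) -> float:
    """Cosine decay from base_lr down to 0 over total_steps.

    Assumes no warmup phase, since the card lists none.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 10000, 20000, 40000):
        print(step, cosine_lr(step))
```

Under these assumptions the effective batch size is 2 × 2 = 4, which agrees with the listed total_train_batch_size, and the learning rate falls to half of its initial value at step 20000.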

### Training results

| Training Loss | Epoch    | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:--------:|:-----:|:---------------:|:-----------------:|
| 0.1231        | 3.5133   | 200   | 0.2983          | 114504            |
| 0.0           | 7.0177   | 400   | 0.2772          | 228504            |
| 0.0           | 10.5310  | 600   | 0.2460          | 341136            |
| 0.0           | 14.0354  | 800   | 0.3245          | 455488            |
| 0.0           | 17.5487  | 1000  | 0.3226          | 569504            |
| 0.0           | 21.0531  | 1200  | 0.3285          | 682024            |
| 0.0           | 24.5664  | 1400  | 0.3362          | 796328            |
| 0.0           | 28.0708  | 1600  | 0.3420          | 909320            |
| 0.0           | 31.5841  | 1800  | 0.3482          | 1023696           |
| 0.0           | 35.0885  | 2000  | 0.3368          | 1137280           |
| 0.0           | 38.6018  | 2200  | 0.3498          | 1251592           |
| 0.0           | 42.1062  | 2400  | 0.3494          | 1364312           |
| 0.0           | 45.6195  | 2600  | 0.3590          | 1478704           |
| 0.0           | 49.1239  | 2800  | 0.3601          | 1591424           |
| 0.0           | 52.6372  | 3000  | 0.3583          | 1705000           |
| 0.0           | 56.1416  | 3200  | 0.3588          | 1818688           |
| 0.0           | 59.6549  | 3400  | 0.3548          | 1932248           |
| 0.0           | 63.1593  | 3600  | 0.3601          | 2045464           |
| 0.0           | 66.6726  | 3800  | 0.3658          | 2159128           |
| 0.0           | 70.1770  | 4000  | 0.3729          | 2272792           |
| 0.0           | 73.6903  | 4200  | 0.3782          | 2387344           |
| 0.0           | 77.1947  | 4400  | 0.3814          | 2500160           |
| 0.0           | 80.7080  | 4600  | 0.3626          | 2614032           |
| 0.0           | 84.2124  | 4800  | 0.3679          | 2728488           |
| 0.0           | 87.7257  | 5000  | 0.3792          | 2842656           |
| 0.0           | 91.2301  | 5200  | 0.3791          | 2956824           |
| 0.0           | 94.7434  | 5400  | 0.4004          | 3069840           |
| 0.0           | 98.2478  | 5600  | 0.3897          | 3183600           |
| 0.0           | 101.7611 | 5800  | 0.3824          | 3297896           |
| 0.0           | 105.2655 | 6000  | 0.3835          | 3411544           |
| 0.0           | 108.7788 | 6200  | 0.3907          | 3525472           |
| 0.0           | 112.2832 | 6400  | 0.4030          | 3638584           |
| 0.0           | 115.7965 | 6600  | 0.4009          | 3752608           |
| 0.0           | 119.3009 | 6800  | 0.4006          | 3865376           |
| 0.0           | 122.8142 | 7000  | 0.4033          | 3979464           |
| 0.0           | 126.3186 | 7200  | 0.4094          | 4093296           |
| 0.0           | 129.8319 | 7400  | 0.4080          | 4207120           |
| 0.0           | 133.3363 | 7600  | 0.4074          | 4320568           |
| 0.0           | 136.8496 | 7800  | 0.4120          | 4434056           |
| 0.0           | 140.3540 | 8000  | 0.4256          | 4547840           |
| 0.0           | 143.8673 | 8200  | 0.4117          | 4662192           |
| 0.0           | 147.3717 | 8400  | 0.4215          | 4774160           |
| 0.0           | 150.8850 | 8600  | 0.4241          | 4887640           |
| 0.0           | 154.3894 | 8800  | 0.4225          | 5002864           |
| 0.0           | 157.9027 | 9000  | 0.4309          | 5116216           |
| 0.0           | 161.4071 | 9200  | 0.4269          | 5229496           |
| 0.0           | 164.9204 | 9400  | 0.4272          | 5343528           |
| 0.0           | 168.4248 | 9600  | 0.4281          | 5455520           |
| 0.0           | 171.9381 | 9800  | 0.4237          | 5571144           |
| 0.0           | 175.4425 | 10000 | 0.4401          | 5684752           |
| 0.0           | 178.9558 | 10200 | 0.4291          | 5799088           |
| 0.0           | 182.4602 | 10400 | 0.4354          | 5911888           |
| 0.0           | 185.9735 | 10600 | 0.4433          | 6025544           |
| 0.0           | 189.4779 | 10800 | 0.4493          | 6139264           |
| 0.0           | 192.9912 | 11000 | 0.4488          | 6252832           |
| 0.0           | 196.4956 | 11200 | 0.4484          | 6366440           |
| 0.0           | 200.0    | 11400 | 0.4492          | 6478776           |
| 0.0           | 203.5133 | 11600 | 0.4521          | 6592280           |
| 0.0           | 207.0177 | 11800 | 0.4557          | 6704968           |
| 0.0           | 210.5310 | 12000 | 0.4463          | 6819568           |
| 0.0           | 214.0354 | 12200 | 0.4519          | 6933264           |
| 0.0           | 217.5487 | 12400 | 0.4537          | 7045688           |
| 0.0           | 221.0531 | 12600 | 0.4610          | 7159888           |
| 0.0           | 224.5664 | 12800 | 0.4564          | 7274296           |
| 0.0           | 228.0708 | 13000 | 0.4594          | 7387544           |
| 0.0           | 231.5841 | 13200 | 0.4661          | 7500200           |
| 0.0           | 235.0885 | 13400 | 0.4695          | 7614696           |
| 0.0           | 238.6018 | 13600 | 0.4755          | 7727608           |
| 0.0           | 242.1062 | 13800 | 0.4837          | 7840696           |
| 0.0           | 245.6195 | 14000 | 0.4702          | 7954632           |
| 0.0           | 249.1239 | 14200 | 0.4909          | 8068648           |
| 0.0           | 252.6372 | 14400 | 0.4822          | 8181840           |
| 0.0           | 256.1416 | 14600 | 0.4791          | 8294896           |
| 0.0           | 259.6549 | 14800 | 0.4915          | 8408512           |
| 0.0           | 263.1593 | 15000 | 0.4854          | 8522664           |
| 0.0           | 266.6726 | 15200 | 0.5012          | 8636032           |
| 0.0           | 270.1770 | 15400 | 0.5022          | 8748624           |
| 0.0           | 273.6903 | 15600 | 0.5095          | 8863248           |
| 0.0           | 277.1947 | 15800 | 0.5141          | 8976424           |
| 0.0           | 280.7080 | 16000 | 0.5122          | 9088984           |
| 0.0           | 284.2124 | 16200 | 0.5215          | 9204128           |
| 0.0           | 287.7257 | 16400 | 0.5182          | 9317208           |
| 0.0           | 291.2301 | 16600 | 0.5424          | 9431208           |
| 0.0           | 294.7434 | 16800 | 0.5420          | 9544328           |
| 0.0           | 298.2478 | 17000 | 0.5455          | 9657432           |
| 0.0           | 301.7611 | 17200 | 0.5556          | 9770824           |
| 0.0           | 305.2655 | 17400 | 0.5646          | 9884648           |
| 0.0           | 308.7788 | 17600 | 0.5576          | 9997288           |
| 0.0           | 312.2832 | 17800 | 0.5532          | 10111472          |
| 0.0           | 315.7965 | 18000 | 0.5568          | 10223648          |
| 0.0           | 319.3009 | 18200 | 0.5883          | 10336864          |
| 0.0           | 322.8142 | 18400 | 0.5703          | 10450688          |
| 0.0           | 326.3186 | 18600 | 0.5664          | 10563128          |
| 0.0           | 329.8319 | 18800 | 0.5949          | 10677928          |
| 0.0           | 333.3363 | 19000 | 0.5918          | 10790896          |
| 0.0           | 336.8496 | 19200 | 0.5862          | 10904600          |
| 0.0           | 340.3540 | 19400 | 0.5627          | 11018112          |
| 0.0           | 343.8673 | 19600 | 0.6012          | 11131712          |
| 0.0           | 347.3717 | 19800 | 0.5383          | 11245728          |
| 0.0           | 350.8850 | 20000 | 0.5387          | 11358800          |
| 0.0           | 354.3894 | 20200 | 0.5425          | 11471832          |
| 0.0           | 357.9027 | 20400 | 0.5417          | 11586368          |
| 0.0           | 361.4071 | 20600 | 0.5680          | 11700176          |
| 0.0           | 364.9204 | 20800 | 0.5215          | 11814304          |
| 0.0           | 368.4248 | 21000 | 0.5595          | 11927464          |
| 0.0           | 371.9381 | 21200 | 0.5175          | 12041416          |
| 0.0           | 375.4425 | 21400 | 0.5527          | 12153176          |
| 0.0           | 378.9558 | 21600 | 0.5344          | 12267984          |
| 0.0           | 382.4602 | 21800 | 0.5042          | 12381424          |
| 0.0           | 385.9735 | 22000 | 0.5430          | 12494280          |
| 0.0           | 389.4779 | 22200 | 0.5208          | 12608008          |
| 0.0           | 392.9912 | 22400 | 0.5807          | 12721456          |
| 0.0           | 396.4956 | 22600 | 0.5171          | 12835240          |
| 0.0           | 400.0    | 22800 | 0.5288          | 12948416          |
| 0.0           | 403.5133 | 23000 | 0.5604          | 13061472          |
| 0.0           | 407.0177 | 23200 | 0.5698          | 13175888          |
| 0.0           | 410.5310 | 23400 | 0.5086          | 13289752          |
| 0.0           | 414.0354 | 23600 | 0.4858          | 13403848          |
| 0.0           | 417.5487 | 23800 | 0.5353          | 13518496          |
| 0.0           | 421.0531 | 24000 | 0.4958          | 13631704          |
| 0.0           | 424.5664 | 24200 | 0.4936          | 13745200          |
| 0.0           | 428.0708 | 24400 | 0.5261          | 13859752          |
| 0.0           | 431.5841 | 24600 | 0.5022          | 13972648          |
| 0.0           | 435.0885 | 24800 | 0.5777          | 14086360          |
| 0.0           | 438.6018 | 25000 | 0.5152          | 14201656          |
| 0.0           | 442.1062 | 25200 | 0.5149          | 14314736          |
| 0.0           | 445.6195 | 25400 | 0.5318          | 14428104          |
| 0.0           | 449.1239 | 25600 | 0.4894          | 14541136          |
| 0.0           | 452.6372 | 25800 | 0.5164          | 14655696          |
| 0.0           | 456.1416 | 26000 | 0.5153          | 14768168          |
| 0.0           | 459.6549 | 26200 | 0.5005          | 14882048          |
| 0.0           | 463.1593 | 26400 | 0.5168          | 14996008          |
| 0.139         | 466.6726 | 26600 | 0.8271          | 15109352          |
| 0.0           | 470.1770 | 26800 | 0.9104          | 15223592          |
| 0.0           | 473.6903 | 27000 | 0.9009          | 15338072          |
| 0.0           | 477.1947 | 27200 | 0.9213          | 15451312          |
| 0.0           | 480.7080 | 27400 | 0.9220          | 15565784          |
| 0.0           | 484.2124 | 27600 | 0.9057          | 15679720          |
| 0.0           | 487.7257 | 27800 | 0.9155          | 15792680          |
| 0.0           | 491.2301 | 28000 | 0.9253          | 15906624          |
| 0.0           | 494.7434 | 28200 | 0.9103          | 16019936          |
| 0.0           | 498.2478 | 28400 | 0.9245          | 16133784          |
| 0.0           | 501.7611 | 28600 | 0.8963          | 16248200          |
| 0.0           | 505.2655 | 28800 | 0.9024          | 16361560          |
| 0.0           | 508.7788 | 29000 | 0.9256          | 16475624          |
| 0.0           | 512.2832 | 29200 | 0.9239          | 16588984          |
| 0.0           | 515.7965 | 29400 | 0.9102          | 16702496          |
| 0.0           | 519.3009 | 29600 | 0.9128          | 16816272          |
| 0.0           | 522.8142 | 29800 | 0.9139          | 16929072          |
| 0.0           | 526.3186 | 30000 | 0.9153          | 17043120          |
| 0.0           | 529.8319 | 30200 | 0.9343          | 17156344          |
| 0.0           | 533.3363 | 30400 | 0.9051          | 17268656          |
| 0.0           | 536.8496 | 30600 | 0.9375          | 17383696          |
| 0.0           | 540.3540 | 30800 | 0.9452          | 17495648          |
| 0.0           | 543.8673 | 31000 | 0.9113          | 17609616          |
| 0.0           | 547.3717 | 31200 | 0.9103          | 17723600          |
| 0.0           | 550.8850 | 31400 | 0.8986          | 17836576          |
| 0.0           | 554.3894 | 31600 | 0.8948          | 17949928          |
| 0.0           | 557.9027 | 31800 | 0.9036          | 18064576          |
| 0.0           | 561.4071 | 32000 | 0.9059          | 18177096          |
| 0.0           | 564.9204 | 32200 | 0.9259          | 18290608          |
| 0.0           | 568.4248 | 32400 | 0.9182          | 18404648          |
| 0.0           | 571.9381 | 32600 | 0.9214          | 18517216          |
| 0.0           | 575.4425 | 32800 | 0.9142          | 18631296          |
| 0.0           | 578.9558 | 33000 | 0.9106          | 18745416          |
| 0.0           | 582.4602 | 33200 | 0.9187          | 18857896          |
| 0.0           | 585.9735 | 33400 | 0.9218          | 18971344          |
| 0.0           | 589.4779 | 33600 | 0.9236          | 19085248          |
| 0.0           | 592.9912 | 33800 | 0.9061          | 19199136          |
| 0.0           | 596.4956 | 34000 | 0.8945          | 19311344          |
| 0.0           | 600.0    | 34200 | 0.8979          | 19425472          |
| 0.0           | 603.5133 | 34400 | 0.9250          | 19539112          |
| 0.0           | 607.0177 | 34600 | 0.9027          | 19652392          |
| 0.0           | 610.5310 | 34800 | 0.9087          | 19766904          |
| 0.0           | 614.0354 | 35000 | 0.8934          | 19879808          |
| 0.0           | 617.5487 | 35200 | 0.9040          | 19993952          |
| 0.0           | 621.0531 | 35400 | 0.9110          | 20107560          |
| 0.0           | 624.5664 | 35600 | 0.8989          | 20220888          |
| 0.0           | 628.0708 | 35800 | 0.9270          | 20333904          |
| 0.0           | 631.5841 | 36000 | 0.8909          | 20446736          |
| 0.0           | 635.0885 | 36200 | 0.9129          | 20560472          |
| 0.0           | 638.6018 | 36400 | 0.8988          | 20673984          |
| 0.0           | 642.1062 | 36600 | 0.8977          | 20786240          |
| 0.0           | 645.6195 | 36800 | 0.8956          | 20899128          |
| 0.0           | 649.1239 | 37000 | 0.9297          | 21011928          |
| 0.0           | 652.6372 | 37200 | 0.8970          | 21126880          |
| 0.0           | 656.1416 | 37400 | 0.9159          | 21239760          |
| 0.0           | 659.6549 | 37600 | 0.9120          | 21353776          |
| 0.0           | 663.1593 | 37800 | 0.8969          | 21467368          |
| 0.0           | 666.6726 | 38000 | 0.8925          | 21581512          |
| 0.0           | 670.1770 | 38200 | 0.8996          | 21694376          |
| 0.0           | 673.6903 | 38400 | 0.8811          | 21808568          |
| 0.0           | 677.1947 | 38600 | 0.9198          | 21922424          |
| 0.0           | 680.7080 | 38800 | 0.9037          | 22036600          |
| 0.0           | 684.2124 | 39000 | 0.8997          | 22150992          |
| 0.0           | 687.7257 | 39200 | 0.9019          | 22263616          |
| 0.0           | 691.2301 | 39400 | 0.8945          | 22377936          |
| 0.0           | 694.7434 | 39600 | 0.9180          | 22490328          |
| 0.0           | 698.2478 | 39800 | 0.9090          | 22604096          |
| 0.0           | 701.7611 | 40000 | 0.9120          | 22718312          |
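
One way to read the table above is to locate the evaluation step with the lowest validation loss and the point where the loss rises most sharply. The sketch below does this for a handful of rows copied from the table; the helper names are hypothetical, and the full table would need to be pasted in to examine every logged step.

```python
# A few rows copied verbatim from the table above (not the full log).
TABLE_ROWS = """
| 0.1231        | 3.5133   | 200   | 0.2983          | 114504            |
| 0.0           | 7.0177   | 400   | 0.2772          | 228504            |
| 0.0           | 10.5310  | 600   | 0.2460          | 341136            |
| 0.0           | 14.0354  | 800   | 0.3245          | 455488            |
| 0.0           | 463.1593 | 26400 | 0.5168          | 14996008          |
| 0.139         | 466.6726 | 26600 | 0.8271          | 15109352          |
| 0.0           | 701.7611 | 40000 | 0.9120          | 22718312          |
"""


def parse_rows(text):
    """Turn markdown table rows into dicts with numeric fields."""
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss, tokens = cells
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "tokens": int(tokens),
        })
    return rows


def best_checkpoint(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


def largest_jump(rows):
    """Adjacent pair of supplied rows with the largest rise in validation loss."""
    return max(zip(rows, rows[1:]),
               key=lambda pair: pair[1]["val_loss"] - pair[0]["val_loss"])


if __name__ == "__main__":
    rows = parse_rows(TABLE_ROWS)
    best = best_checkpoint(rows)
    before, after = largest_jump(rows)
    print("lowest validation loss:", best["val_loss"], "at step", best["step"])
    print("largest rise between steps", before["step"], "and", after["step"])
```

On these rows the lowest validation loss is 0.2460 at step 600, and the sharpest rise falls between steps 26400 and 26600, where the training loss also reads 0.139 instead of 0.0.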


### Framework versions

- PEFT 0.15.2.dev0
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1