yigagilbert committed
Commit b2261ea · verified · 1 Parent(s): 9bd88be

End of training

Files changed (6)
  1. README.md +127 -0
  2. config.json +174 -0
  3. model.safetensors +3 -0
  4. tokenizer.json +0 -0
  5. tokenizer_config.json +114 -0
  6. training_args.bin +3 -0
README.md ADDED
@@ -0,0 +1,127 @@
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: google/t5-efficient-tiny
+ tags:
+ - generated_from_trainer
+ metrics:
+ - accuracy
+ - precision
+ - recall
+ - f1
+ model-index:
+ - name: sunflower_language_classification_v1
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # sunflower_language_classification_v1
+
+ This model is a fine-tuned version of [google/t5-efficient-tiny](https://huggingface.co/google/t5-efficient-tiny) on an unspecified dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.7212
+ - Accuracy: 0.8297
+ - Precision: 0.8471
+ - Recall: 0.8297
+ - F1: 0.8191
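The Recall value above is identical to Accuracy, and the same holds in every row of the training results table. That is what support-weighted averaging produces, because weighted recall is algebraically equal to accuracy. The card does not say which averaging was used, so treat this as an assumption. A minimal pure-Python sketch, using toy labels invented for illustration:

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true examples per label
    predicted = Counter(y_pred)    # predictions per label
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n         # weight each label by its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    accuracy = sum(correct.values()) / n
    return accuracy, precision, recall, f1

# Toy labels, invented for illustration (not from the evaluation set).
y_true = ["lug", "lug", "lug", "ach", "ach", "swa"]
y_pred = ["lug", "lug", "ach", "ach", "swa", "swa"]
acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(acc, prec, rec, f1)
```

On the toy data, accuracy and weighted recall come out equal while precision differs, which is the same pattern the card reports.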
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.001
+ - train_batch_size: 64
+ - eval_batch_size: 64
+ - seed: 42
+ - optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 10
+ - training_steps: 30000
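The scheduler settings above (linear, 10 warmup steps, 30000 training steps, peak learning rate 0.001) imply a short ramp followed by a straight-line decay to zero. The sketch below reproduces that shape in plain Python. The function name `linear_lr` is hypothetical, and the formula assumes the usual linear-warmup, linear-decay rule rather than anything confirmed by this commit:

```python
def linear_lr(step, base_lr=1e-3, warmup_steps=10, total_steps=30_000):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        scale = step / max(1, warmup_steps)
    else:
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * scale

# Sample the schedule at the start, mid-warmup, peak, halfway down, and the end.
for step in (0, 5, 10, 15_005, 30_000):
    print(step, linear_lr(step))
```

With only 10 warmup steps, the model trains at close to the peak rate almost from the start, and the rate has halved by roughly step 15000.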
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
+ |:-------------:|:------:|:-----:|:---------------:|:--------:|:---------:|:------:|:------:|
+ | 2.3995 | 0.0167 | 500 | 2.0015 | 0.5145 | 0.4412 | 0.5145 | 0.4517 |
+ | 1.3282 | 0.0334 | 1000 | 1.6467 | 0.5688 | 0.4908 | 0.5688 | 0.5080 |
+ | 1.1086 | 0.0502 | 1500 | 1.5051 | 0.6304 | 0.5784 | 0.6304 | 0.5766 |
+ | 0.9882 | 0.0669 | 2000 | 1.4518 | 0.6268 | 0.6374 | 0.6268 | 0.5891 |
+ | 0.9187 | 0.0836 | 2500 | 1.3470 | 0.6522 | 0.6245 | 0.6522 | 0.6093 |
+ | 0.8546 | 0.1003 | 3000 | 1.3747 | 0.6159 | 0.5871 | 0.6159 | 0.5760 |
+ | 0.8214 | 0.1170 | 3500 | 1.2708 | 0.6703 | 0.6316 | 0.6703 | 0.6323 |
+ | 0.7843 | 0.1338 | 4000 | 1.1659 | 0.6848 | 0.6639 | 0.6848 | 0.6461 |
+ | 0.7470 | 0.1505 | 4500 | 1.1969 | 0.6848 | 0.6534 | 0.6848 | 0.6491 |
+ | 0.7299 | 0.1672 | 5000 | 1.0592 | 0.7101 | 0.7030 | 0.7101 | 0.6748 |
+ | 0.7041 | 0.1839 | 5500 | 1.0536 | 0.6848 | 0.6728 | 0.6848 | 0.6534 |
+ | 0.6755 | 0.2006 | 6000 | 1.0265 | 0.7138 | 0.7298 | 0.7138 | 0.6852 |
+ | 0.6683 | 0.2174 | 6500 | 1.0049 | 0.7428 | 0.7403 | 0.7428 | 0.7089 |
+ | 0.6573 | 0.2341 | 7000 | 1.0702 | 0.7029 | 0.7052 | 0.7029 | 0.6764 |
+ | 0.6372 | 0.2508 | 7500 | 1.0260 | 0.7210 | 0.7143 | 0.7210 | 0.6998 |
+ | 0.6173 | 0.2675 | 8000 | 0.9654 | 0.7428 | 0.7492 | 0.7428 | 0.7141 |
+ | 0.6009 | 0.2842 | 8500 | 1.0185 | 0.7464 | 0.7504 | 0.7464 | 0.7167 |
+ | 0.5924 | 0.3010 | 9000 | 1.0028 | 0.7283 | 0.7652 | 0.7283 | 0.7052 |
+ | 0.5916 | 0.3177 | 9500 | 0.9581 | 0.7174 | 0.7217 | 0.7174 | 0.6893 |
+ | 0.5806 | 0.3344 | 10000 | 1.0011 | 0.7355 | 0.7618 | 0.7355 | 0.7149 |
+ | 0.5672 | 0.3511 | 10500 | 0.8978 | 0.7572 | 0.7429 | 0.7572 | 0.7307 |
+ | 0.5580 | 0.3678 | 11000 | 0.9525 | 0.7210 | 0.7308 | 0.7210 | 0.7013 |
+ | 0.5520 | 0.3846 | 11500 | 0.8647 | 0.7645 | 0.7695 | 0.7645 | 0.7391 |
+ | 0.5552 | 0.4013 | 12000 | 0.8977 | 0.7536 | 0.7698 | 0.7536 | 0.7358 |
+ | 0.5341 | 0.4180 | 12500 | 0.8526 | 0.7536 | 0.7625 | 0.7536 | 0.7305 |
+ | 0.5284 | 0.4347 | 13000 | 0.8496 | 0.7464 | 0.7310 | 0.7464 | 0.7166 |
+ | 0.5322 | 0.4514 | 13500 | 0.7672 | 0.8007 | 0.8006 | 0.8007 | 0.7827 |
+ | 0.5229 | 0.4681 | 14000 | 0.8253 | 0.7754 | 0.7698 | 0.7754 | 0.7515 |
+ | 0.5007 | 0.4849 | 14500 | 0.8496 | 0.7826 | 0.7649 | 0.7826 | 0.7547 |
+ | 0.5109 | 0.5016 | 15000 | 0.7700 | 0.7754 | 0.7767 | 0.7754 | 0.7518 |
+ | 0.4989 | 0.5183 | 15500 | 0.8338 | 0.7645 | 0.7741 | 0.7645 | 0.7419 |
+ | 0.4991 | 0.5350 | 16000 | 0.7927 | 0.7754 | 0.7928 | 0.7754 | 0.7625 |
+ | 0.4977 | 0.5517 | 16500 | 0.7859 | 0.7790 | 0.7670 | 0.7790 | 0.7551 |
+ | 0.4854 | 0.5685 | 17000 | 0.7915 | 0.7862 | 0.7907 | 0.7862 | 0.7630 |
+ | 0.4826 | 0.5852 | 17500 | 0.7628 | 0.8043 | 0.7964 | 0.8043 | 0.7846 |
+ | 0.4765 | 0.6019 | 18000 | 0.7632 | 0.7971 | 0.8008 | 0.7971 | 0.7791 |
+ | 0.4641 | 0.6186 | 18500 | 0.7722 | 0.7935 | 0.7660 | 0.7935 | 0.7670 |
+ | 0.4783 | 0.6353 | 19000 | 0.7046 | 0.7899 | 0.8111 | 0.7899 | 0.7773 |
+ | 0.4745 | 0.6521 | 19500 | 0.7342 | 0.7899 | 0.8044 | 0.7899 | 0.7726 |
+ | 0.4555 | 0.6688 | 20000 | 0.7116 | 0.7862 | 0.7853 | 0.7862 | 0.7662 |
+ | 0.4530 | 0.6855 | 20500 | 0.7385 | 0.7754 | 0.7658 | 0.7754 | 0.7557 |
+ | 0.4565 | 0.7022 | 21000 | 0.7651 | 0.7899 | 0.8132 | 0.7899 | 0.7770 |
+ | 0.4555 | 0.7189 | 21500 | 0.7902 | 0.7681 | 0.7812 | 0.7681 | 0.7569 |
+ | 0.4485 | 0.7357 | 22000 | 0.7613 | 0.7862 | 0.7962 | 0.7862 | 0.7686 |
+ | 0.4518 | 0.7524 | 22500 | 0.7544 | 0.7862 | 0.7944 | 0.7862 | 0.7676 |
+ | 0.4508 | 0.7691 | 23000 | 0.7296 | 0.8043 | 0.8110 | 0.8043 | 0.7907 |
+ | 0.4418 | 0.7858 | 23500 | 0.7293 | 0.8261 | 0.8527 | 0.8261 | 0.8137 |
+ | 0.4365 | 0.8025 | 24000 | 0.7370 | 0.8043 | 0.8217 | 0.8043 | 0.7928 |
+ | 0.4353 | 0.8193 | 24500 | 0.7100 | 0.8188 | 0.8274 | 0.8188 | 0.8049 |
+ | 0.4240 | 0.8360 | 25000 | 0.7273 | 0.7862 | 0.7857 | 0.7862 | 0.7697 |
+ | 0.4205 | 0.8527 | 25500 | 0.7297 | 0.8225 | 0.8351 | 0.8225 | 0.8059 |
+ | 0.4316 | 0.8694 | 26000 | 0.7204 | 0.8116 | 0.8066 | 0.8116 | 0.7911 |
+ | 0.4176 | 0.8861 | 26500 | 0.7340 | 0.8080 | 0.8184 | 0.8080 | 0.7922 |
+ | 0.4240 | 0.9029 | 27000 | 0.7298 | 0.8116 | 0.8223 | 0.8116 | 0.7964 |
+ | 0.4149 | 0.9196 | 27500 | 0.7410 | 0.8188 | 0.8185 | 0.8188 | 0.8023 |
+ | 0.4159 | 0.9363 | 28000 | 0.7303 | 0.8152 | 0.8388 | 0.8152 | 0.8069 |
+ | 0.4068 | 0.9530 | 28500 | 0.7220 | 0.8043 | 0.8209 | 0.8043 | 0.7955 |
+ | 0.4135 | 0.9697 | 29000 | 0.7313 | 0.8188 | 0.8238 | 0.8188 | 0.8055 |
+ | 0.4130 | 0.9865 | 29500 | 0.7221 | 0.8225 | 0.8320 | 0.8225 | 0.8095 |
+ | 0.4213 | 1.0032 | 30000 | 0.7212 | 0.8297 | 0.8471 | 0.8297 | 0.8191 |
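The Epoch and Step columns allow a rough estimate of the training set size, which the card itself does not state. The arithmetic below assumes a single device and no gradient accumulation, so that one step consumes exactly `train_batch_size` examples. The Epoch value is rounded to four decimals, so the result is approximate:

```python
step, epoch = 30_000, 1.0032   # last row of the training results table
train_batch_size = 64          # from the hyperparameter list

# One epoch is one full pass over the data, so steps / epochs gives steps per pass.
steps_per_epoch = step / epoch
approx_train_examples = steps_per_epoch * train_batch_size

print(f"steps per epoch ≈ {steps_per_epoch:,.0f}")
print(f"training examples ≈ {approx_train_examples:,.0f}")
```

Under those assumptions the run covers just over one pass through roughly 1.9 million training examples.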
+
+
+ ### Framework versions
+
+ - Transformers 5.8.0
+ - PyTorch 2.11.0+cu130
+ - Datasets 4.8.5
+ - Tokenizers 0.22.2
config.json ADDED
@@ -0,0 +1,174 @@
+ {
+   "architectures": [
+     "T5ForSequenceClassification"
+   ],
+   "classifier_dropout": 0.0,
+   "d_ff": 1024,
+   "d_kv": 64,
+   "d_model": 256,
+   "decoder_start_token_id": 0,
+   "dense_act_fn": "relu",
+   "dropout_rate": 0.1,
+   "dtype": "float32",
+   "eos_token_id": 1,
+   "feed_forward_proj": "relu",
+   "id2label": {
+     "0": "ach",
+     "1": "adh",
+     "2": "afr",
+     "3": "aka",
+     "4": "alz",
+     "5": "amh",
+     "6": "bam",
+     "7": "bem",
+     "8": "ber",
+     "9": "bfa",
+     "10": "cgg",
+     "11": "dag",
+     "12": "dga",
+     "13": "din",
+     "14": "eng",
+     "15": "ewe",
+     "16": "fra",
+     "17": "ful",
+     "18": "gwr",
+     "19": "hau",
+     "20": "ibo",
+     "21": "ikx",
+     "22": "kab",
+     "23": "kau",
+     "24": "kdi",
+     "25": "kdj",
+     "26": "keo",
+     "27": "kik",
+     "28": "kin",
+     "29": "koo",
+     "30": "kpz",
+     "31": "laj",
+     "32": "led",
+     "33": "lgg",
+     "34": "lin",
+     "35": "lsm",
+     "36": "luc",
+     "37": "lug",
+     "38": "luo",
+     "39": "luy",
+     "40": "mhi",
+     "41": "mlg",
+     "42": "myx",
+     "43": "nbl",
+     "44": "nuj",
+     "45": "nya",
+     "46": "nyn",
+     "47": "nyo",
+     "48": "orm",
+     "49": "pcm",
+     "50": "pok",
+     "51": "rub",
+     "52": "ruc",
+     "53": "run",
+     "54": "rwm",
+     "55": "sna",
+     "56": "som",
+     "57": "sot",
+     "58": "swa",
+     "59": "teo",
+     "60": "tlj",
+     "61": "tsn",
+     "62": "ttj",
+     "63": "wol",
+     "64": "xho",
+     "65": "xog",
+     "66": "yor",
+     "67": "zul"
+   },
+   "initializer_factor": 1.0,
+   "is_decoder": false,
+   "is_encoder_decoder": true,
+   "is_gated_act": false,
+   "label2id": {
+     "ach": 0,
+     "adh": 1,
+     "afr": 2,
+     "aka": 3,
+     "alz": 4,
+     "amh": 5,
+     "bam": 6,
+     "bem": 7,
+     "ber": 8,
+     "bfa": 9,
+     "cgg": 10,
+     "dag": 11,
+     "dga": 12,
+     "din": 13,
+     "eng": 14,
+     "ewe": 15,
+     "fra": 16,
+     "ful": 17,
+     "gwr": 18,
+     "hau": 19,
+     "ibo": 20,
+     "ikx": 21,
+     "kab": 22,
+     "kau": 23,
+     "kdi": 24,
+     "kdj": 25,
+     "keo": 26,
+     "kik": 27,
+     "kin": 28,
+     "koo": 29,
+     "kpz": 30,
+     "laj": 31,
+     "led": 32,
+     "lgg": 33,
+     "lin": 34,
+     "lsm": 35,
+     "luc": 36,
+     "lug": 37,
+     "luo": 38,
+     "luy": 39,
+     "mhi": 40,
+     "mlg": 41,
+     "myx": 42,
+     "nbl": 43,
+     "nuj": 44,
+     "nya": 45,
+     "nyn": 46,
+     "nyo": 47,
+     "orm": 48,
+     "pcm": 49,
+     "pok": 50,
+     "rub": 51,
+     "ruc": 52,
+     "run": 53,
+     "rwm": 54,
+     "sna": 55,
+     "som": 56,
+     "sot": 57,
+     "swa": 58,
+     "teo": 59,
+     "tlj": 60,
+     "tsn": 61,
+     "ttj": 62,
+     "wol": 63,
+     "xho": 64,
+     "xog": 65,
+     "yor": 66,
+     "zul": 67
+   },
+   "layer_norm_epsilon": 1e-06,
+   "model_type": "t5",
+   "n_positions": 512,
+   "num_decoder_layers": 4,
+   "num_heads": 4,
+   "num_layers": 4,
+   "pad_token_id": 0,
+   "problem_type": "single_label_classification",
+   "relative_attention_max_distance": 128,
+   "relative_attention_num_buckets": 32,
+   "scale_decoder_outputs": true,
+   "tie_word_embeddings": true,
+   "transformers_version": "5.8.0",
+   "use_cache": false,
+   "vocab_size": 32128
+ }
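The `id2label` and `label2id` maps in this config are what turn a row of classifier logits into a language code. When the raw JSON is read directly, the `id2label` keys arrive as strings, so they need converting before use as indices. The sketch below uses a three-label excerpt of the maps. The helper `predict_label` and the example logits are hypothetical, for illustration only:

```python
import json

# A three-label excerpt of the config above; the real file has 68 labels.
config = json.loads("""
{
  "id2label": {"0": "ach", "1": "adh", "2": "afr"},
  "label2id": {"ach": 0, "adh": 1, "afr": 2}
}
""")

# JSON object keys are always strings, so convert them to ints once.
id2label = {int(k): v for k, v in config["id2label"].items()}
label2id = config["label2id"]

# The two maps should be exact inverses of each other.
assert all(label2id[label] == idx for idx, label in id2label.items())

def predict_label(logits):
    """Return the label with the largest logit (single-label classification)."""
    best = max(range(len(logits)), key=logits.__getitem__)
    return id2label[best]

print(predict_label([-1.2, 0.3, 2.5]))  # prints "afr"
```

Taking the arg-max matches `"problem_type": "single_label_classification"`, where exactly one label is predicted per input.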
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2f2e5888ff7741e36c1f60cf0894267db00e3f69abbc7f540bc45918531dd55e
+ size 62627616
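The `size` field of this pointer gives a quick upper bound on the parameter count. The estimate assumes every tensor is stored as float32, as `"dtype": "float32"` in config.json suggests. The file also carries a small JSON header, so the true count is slightly lower:

```python
size_bytes = 62_627_616   # "size" field of the LFS pointer above
bytes_per_param = 4       # float32, per "dtype" in config.json (assumed for all tensors)

# Upper bound: part of the file is a header, not weights.
max_params = size_bytes // bytes_per_param
print(f"at most {max_params:,} parameters (~{max_params / 1e6:.1f}M)")
```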
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1,114 @@
+ {
+   "backend": "tokenizers",
+   "eos_token": "</s>",
+   "extra_ids": 100,
+   "extra_special_tokens": [
+     "<extra_id_0>",
+     "<extra_id_1>",
+     "<extra_id_2>",
+     "<extra_id_3>",
+     "<extra_id_4>",
+     "<extra_id_5>",
+     "<extra_id_6>",
+     "<extra_id_7>",
+     "<extra_id_8>",
+     "<extra_id_9>",
+     "<extra_id_10>",
+     "<extra_id_11>",
+     "<extra_id_12>",
+     "<extra_id_13>",
+     "<extra_id_14>",
+     "<extra_id_15>",
+     "<extra_id_16>",
+     "<extra_id_17>",
+     "<extra_id_18>",
+     "<extra_id_19>",
+     "<extra_id_20>",
+     "<extra_id_21>",
+     "<extra_id_22>",
+     "<extra_id_23>",
+     "<extra_id_24>",
+     "<extra_id_25>",
+     "<extra_id_26>",
+     "<extra_id_27>",
+     "<extra_id_28>",
+     "<extra_id_29>",
+     "<extra_id_30>",
+     "<extra_id_31>",
+     "<extra_id_32>",
+     "<extra_id_33>",
+     "<extra_id_34>",
+     "<extra_id_35>",
+     "<extra_id_36>",
+     "<extra_id_37>",
+     "<extra_id_38>",
+     "<extra_id_39>",
+     "<extra_id_40>",
+     "<extra_id_41>",
+     "<extra_id_42>",
+     "<extra_id_43>",
+     "<extra_id_44>",
+     "<extra_id_45>",
+     "<extra_id_46>",
+     "<extra_id_47>",
+     "<extra_id_48>",
+     "<extra_id_49>",
+     "<extra_id_50>",
+     "<extra_id_51>",
+     "<extra_id_52>",
+     "<extra_id_53>",
+     "<extra_id_54>",
+     "<extra_id_55>",
+     "<extra_id_56>",
+     "<extra_id_57>",
+     "<extra_id_58>",
+     "<extra_id_59>",
+     "<extra_id_60>",
+     "<extra_id_61>",
+     "<extra_id_62>",
+     "<extra_id_63>",
+     "<extra_id_64>",
+     "<extra_id_65>",
+     "<extra_id_66>",
+     "<extra_id_67>",
+     "<extra_id_68>",
+     "<extra_id_69>",
+     "<extra_id_70>",
+     "<extra_id_71>",
+     "<extra_id_72>",
+     "<extra_id_73>",
+     "<extra_id_74>",
+     "<extra_id_75>",
+     "<extra_id_76>",
+     "<extra_id_77>",
+     "<extra_id_78>",
+     "<extra_id_79>",
+     "<extra_id_80>",
+     "<extra_id_81>",
+     "<extra_id_82>",
+     "<extra_id_83>",
+     "<extra_id_84>",
+     "<extra_id_85>",
+     "<extra_id_86>",
+     "<extra_id_87>",
+     "<extra_id_88>",
+     "<extra_id_89>",
+     "<extra_id_90>",
+     "<extra_id_91>",
+     "<extra_id_92>",
+     "<extra_id_93>",
+     "<extra_id_94>",
+     "<extra_id_95>",
+     "<extra_id_96>",
+     "<extra_id_97>",
+     "<extra_id_98>",
+     "<extra_id_99>"
+   ],
+   "is_local": false,
+   "local_files_only": false,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "sp_model_kwargs": {},
+   "tokenizer_class": "T5Tokenizer",
+   "unk_token": "<unk>"
+ }
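The `model_max_length` value above is not a real limit. It equals `int(1e30)`, which Transformers has used as a placeholder meaning no maximum length was configured, so truncation to "the model maximum" would have no practical effect. A hedged sketch of guarding against that when choosing a truncation length follows. The helper `effective_max_length` is hypothetical, and falling back to `n_positions` (512) from config.json is only one reasonable choice, since the sequence length used in training is not stated:

```python
SENTINEL = int(1e30)  # the float 1e30 converted to int; not exactly 10**30

model_max_length = 1000000000000000019884624838656  # from tokenizer_config.json
n_positions = 512                                   # from config.json

def effective_max_length(model_max_length, fallback):
    """Use the tokenizer limit if it is a real one, otherwise the fallback."""
    return fallback if model_max_length >= SENTINEL else model_max_length

print(effective_max_length(model_max_length, n_positions))  # prints 512
```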
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a4e3aa797123ac167307658f133886f118137de20d3071fe677d42112b37086d
+ size 5329
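Both binary files in this commit are stored as Git LFS pointers: three `key value` lines that give the spec version, a SHA-256 object id, and the byte size of the real file. A minimal parser sketch follows. The function name `parse_lfs_pointer` is hypothetical, and it handles only the three fields shown here:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its fields."""
    # Each line is "<key> <value>"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a4e3aa797123ac167307658f133886f118137de20d3071fe677d42112b37086d
size 5329
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])  # prints "sha256 5329"
```

The parsed digest and size can be compared with a downloaded file's SHA-256 hash and length to confirm it matches the committed object.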