thenlpresearcher committed on
Commit b1261bc · verified · 1 Parent(s): 8cc40ef

thenlpresearcher/bert-punctuation-restoration-kaustubh
README.md ADDED
@@ -0,0 +1,142 @@
---
library_name: transformers
license: apache-2.0
base_model: bert-large-uncased
tags:
- generated_from_trainer
metrics:
- f1
- precision
- recall
model-index:
- name: bert_punct_model
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# bert_punct_model

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1454
- F1: 0.8223
- Precision: 0.8256
- Recall: 0.8190

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 | Precision | Recall |
|:-------------:|:------:|:-----:|:---------------:|:------:|:---------:|:------:|
| 0.2214 | 0.0388 | 500 | 0.1982 | 0.7598 | 0.7509 | 0.7690 |
| 0.1839 | 0.0776 | 1000 | 0.1660 | 0.7803 | 0.7938 | 0.7672 |
| 0.1741 | 0.1164 | 1500 | 0.1612 | 0.7849 | 0.8155 | 0.7566 |
| 0.1546 | 0.1553 | 2000 | 0.1631 | 0.7884 | 0.7757 | 0.8015 |
| 0.1575 | 0.1941 | 2500 | 0.1598 | 0.7864 | 0.7841 | 0.7887 |
| 0.1729 | 0.2329 | 3000 | 0.1551 | 0.7886 | 0.8045 | 0.7734 |
| 0.1463 | 0.2717 | 3500 | 0.1480 | 0.7912 | 0.7970 | 0.7854 |
| 0.1379 | 0.3105 | 4000 | 0.1446 | 0.7938 | 0.7994 | 0.7883 |
| 0.1491 | 0.3493 | 4500 | 0.1470 | 0.7971 | 0.8206 | 0.7748 |
| 0.1384 | 0.3881 | 5000 | 0.1411 | 0.7972 | 0.8148 | 0.7803 |
| 0.1455 | 0.4270 | 5500 | 0.1394 | 0.8036 | 0.8210 | 0.7869 |
| 0.1397 | 0.4658 | 6000 | 0.1419 | 0.8068 | 0.8274 | 0.7872 |
| 0.1433 | 0.5046 | 6500 | 0.1407 | 0.7974 | 0.8271 | 0.7697 |
| 0.135 | 0.5434 | 7000 | 0.1359 | 0.8065 | 0.8292 | 0.7850 |
| 0.1411 | 0.5822 | 7500 | 0.1446 | 0.8030 | 0.8164 | 0.7901 |
| 0.1415 | 0.6210 | 8000 | 0.1450 | 0.7994 | 0.8003 | 0.7985 |
| 0.1379 | 0.6598 | 8500 | 0.1441 | 0.8017 | 0.7915 | 0.8120 |
| 0.1399 | 0.6986 | 9000 | 0.1328 | 0.8116 | 0.8354 | 0.7891 |
| 0.132 | 0.7375 | 9500 | 0.1357 | 0.8029 | 0.8168 | 0.7894 |
| 0.1355 | 0.7763 | 10000 | 0.1367 | 0.8100 | 0.8248 | 0.7956 |
| 0.1342 | 0.8151 | 10500 | 0.1367 | 0.8087 | 0.8153 | 0.8022 |
| 0.1292 | 0.8539 | 11000 | 0.1344 | 0.8088 | 0.8164 | 0.8015 |
| 0.1301 | 0.8927 | 11500 | 0.1323 | 0.8194 | 0.8303 | 0.8088 |
| 0.1282 | 0.9315 | 12000 | 0.1319 | 0.8111 | 0.8249 | 0.7978 |
| 0.1265 | 0.9703 | 12500 | 0.1367 | 0.8120 | 0.8202 | 0.8040 |
| 0.1156 | 1.0092 | 13000 | 0.1354 | 0.8108 | 0.8137 | 0.8080 |
| 0.1068 | 1.0480 | 13500 | 0.1375 | 0.8176 | 0.8163 | 0.8190 |
| 0.1074 | 1.0868 | 14000 | 0.1357 | 0.8146 | 0.8123 | 0.8168 |
| 0.1011 | 1.1256 | 14500 | 0.1332 | 0.8131 | 0.8141 | 0.8120 |
| 0.1054 | 1.1644 | 15000 | 0.1364 | 0.8152 | 0.8096 | 0.8208 |
| 0.1069 | 1.2032 | 15500 | 0.1368 | 0.8174 | 0.8195 | 0.8153 |
| 0.1069 | 1.2420 | 16000 | 0.1359 | 0.8183 | 0.8231 | 0.8135 |
| 0.1047 | 1.2809 | 16500 | 0.1286 | 0.8210 | 0.8268 | 0.8153 |
| 0.1032 | 1.3197 | 17000 | 0.1315 | 0.8116 | 0.8082 | 0.8150 |
| 0.1021 | 1.3585 | 17500 | 0.1327 | 0.8108 | 0.8082 | 0.8135 |
| 0.1003 | 1.3973 | 18000 | 0.1315 | 0.8162 | 0.8171 | 0.8153 |
| 0.0965 | 1.4361 | 18500 | 0.1339 | 0.8136 | 0.8214 | 0.8058 |
| 0.0966 | 1.4749 | 19000 | 0.1308 | 0.8162 | 0.8204 | 0.8120 |
| 0.1034 | 1.5137 | 19500 | 0.1354 | 0.8127 | 0.8227 | 0.8029 |
| 0.1007 | 1.5526 | 20000 | 0.1317 | 0.8150 | 0.8155 | 0.8146 |
| 0.1056 | 1.5914 | 20500 | 0.1299 | 0.8142 | 0.8232 | 0.8055 |
| 0.0987 | 1.6302 | 21000 | 0.1332 | 0.8215 | 0.8320 | 0.8113 |
| 0.1019 | 1.6690 | 21500 | 0.1314 | 0.8214 | 0.8341 | 0.8091 |
| 0.1046 | 1.7078 | 22000 | 0.1289 | 0.8184 | 0.8287 | 0.8084 |
| 0.0966 | 1.7466 | 22500 | 0.1321 | 0.8216 | 0.8333 | 0.8102 |
| 0.1003 | 1.7854 | 23000 | 0.1279 | 0.8191 | 0.8260 | 0.8124 |
| 0.105 | 1.8243 | 23500 | 0.1302 | 0.8158 | 0.8260 | 0.8058 |
| 0.0976 | 1.8631 | 24000 | 0.1303 | 0.8178 | 0.8214 | 0.8142 |
| 0.0965 | 1.9019 | 24500 | 0.1267 | 0.8185 | 0.8258 | 0.8113 |
| 0.0966 | 1.9407 | 25000 | 0.1275 | 0.8222 | 0.8240 | 0.8204 |
| 0.099 | 1.9795 | 25500 | 0.1273 | 0.8222 | 0.8319 | 0.8128 |
| 0.0733 | 2.0183 | 26000 | 0.1439 | 0.8210 | 0.8250 | 0.8172 |
| 0.0765 | 2.0571 | 26500 | 0.1418 | 0.8172 | 0.8177 | 0.8168 |
| 0.0708 | 2.0959 | 27000 | 0.1443 | 0.8174 | 0.8211 | 0.8139 |
| 0.073 | 2.1348 | 27500 | 0.1429 | 0.8209 | 0.8265 | 0.8153 |
| 0.0787 | 2.1736 | 28000 | 0.1380 | 0.8178 | 0.8191 | 0.8164 |
| 0.0672 | 2.2124 | 28500 | 0.1423 | 0.8177 | 0.8242 | 0.8113 |
| 0.0694 | 2.2512 | 29000 | 0.1422 | 0.8185 | 0.8222 | 0.8150 |
| 0.0715 | 2.2900 | 29500 | 0.1473 | 0.8190 | 0.8172 | 0.8208 |
| 0.0724 | 2.3288 | 30000 | 0.1412 | 0.8182 | 0.8152 | 0.8212 |
| 0.0718 | 2.3676 | 30500 | 0.1429 | 0.8192 | 0.8213 | 0.8172 |
| 0.071 | 2.4065 | 31000 | 0.1427 | 0.8254 | 0.8294 | 0.8215 |
| 0.0734 | 2.4453 | 31500 | 0.1495 | 0.8225 | 0.8241 | 0.8208 |
| 0.0733 | 2.4841 | 32000 | 0.1423 | 0.8200 | 0.8262 | 0.8139 |
| 0.0658 | 2.5229 | 32500 | 0.1447 | 0.8212 | 0.8287 | 0.8139 |
| 0.0704 | 2.5617 | 33000 | 0.1443 | 0.8215 | 0.8293 | 0.8139 |
| 0.0683 | 2.6005 | 33500 | 0.1447 | 0.8226 | 0.8252 | 0.8201 |
| 0.0678 | 2.6393 | 34000 | 0.1464 | 0.8236 | 0.8268 | 0.8204 |
| 0.0673 | 2.6782 | 34500 | 0.1450 | 0.8239 | 0.8292 | 0.8186 |
| 0.0679 | 2.7170 | 35000 | 0.1471 | 0.8190 | 0.8215 | 0.8164 |
| 0.068 | 2.7558 | 35500 | 0.1475 | 0.8207 | 0.8299 | 0.8117 |
| 0.0676 | 2.7946 | 36000 | 0.1466 | 0.8196 | 0.8225 | 0.8168 |
| 0.0686 | 2.8334 | 36500 | 0.1441 | 0.8225 | 0.8272 | 0.8179 |
| 0.0677 | 2.8722 | 37000 | 0.1464 | 0.8222 | 0.8235 | 0.8208 |
| 0.0714 | 2.9110 | 37500 | 0.1456 | 0.8200 | 0.8218 | 0.8182 |
| 0.0679 | 2.9499 | 38000 | 0.1465 | 0.8218 | 0.8249 | 0.8186 |
| 0.0666 | 2.9887 | 38500 | 0.1454 | 0.8223 | 0.8256 | 0.8190 |

### Framework versions

- Transformers 4.53.2
- Pytorch 2.4.0a0+f70bd71a48.nv24.06
- Datasets 3.6.0
- Tokenizers 0.21.4
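The card does not yet document how predictions map back to text, so the following is only an illustrative sketch of the word-level decoding step in punctuation restoration. The label scheme here ("O" for no punctuation, otherwise the mark to append to the word) is an assumption; the checkpoint's actual LABEL_0..LABEL_34 meanings are not documented, and `restore_punctuation` is a hypothetical helper, not part of this repository.

```python
# Hypothetical decoding step for punctuation restoration. Assumes a simple
# word-level label scheme: "O" means no punctuation after the word,
# any other label is the punctuation mark to append.

def restore_punctuation(words, labels):
    """Reattach predicted punctuation marks to a sequence of words."""
    restored = []
    for word, label in zip(words, labels):
        restored.append(word if label == "O" else word + label)
    return " ".join(restored)

print(restore_punctuation(
    ["hello", "world", "how", "are", "you"],
    ["O", ".", "O", "O", "?"],
))  # hello world. how are you?
```

Capitalization after sentence-final marks would be a separate post-processing step; the uncased base model does not predict it.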
config.json ADDED
@@ -0,0 +1,99 @@
{
  "architectures": [
    "BertForTokenClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3",
    "4": "LABEL_4",
    "5": "LABEL_5",
    "6": "LABEL_6",
    "7": "LABEL_7",
    "8": "LABEL_8",
    "9": "LABEL_9",
    "10": "LABEL_10",
    "11": "LABEL_11",
    "12": "LABEL_12",
    "13": "LABEL_13",
    "14": "LABEL_14",
    "15": "LABEL_15",
    "16": "LABEL_16",
    "17": "LABEL_17",
    "18": "LABEL_18",
    "19": "LABEL_19",
    "20": "LABEL_20",
    "21": "LABEL_21",
    "22": "LABEL_22",
    "23": "LABEL_23",
    "24": "LABEL_24",
    "25": "LABEL_25",
    "26": "LABEL_26",
    "27": "LABEL_27",
    "28": "LABEL_28",
    "29": "LABEL_29",
    "30": "LABEL_30",
    "31": "LABEL_31",
    "32": "LABEL_32",
    "33": "LABEL_33",
    "34": "LABEL_34"
  },
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_10": 10,
    "LABEL_11": 11,
    "LABEL_12": 12,
    "LABEL_13": 13,
    "LABEL_14": 14,
    "LABEL_15": 15,
    "LABEL_16": 16,
    "LABEL_17": 17,
    "LABEL_18": 18,
    "LABEL_19": 19,
    "LABEL_2": 2,
    "LABEL_20": 20,
    "LABEL_21": 21,
    "LABEL_22": 22,
    "LABEL_23": 23,
    "LABEL_24": 24,
    "LABEL_25": 25,
    "LABEL_26": 26,
    "LABEL_27": 27,
    "LABEL_28": 28,
    "LABEL_29": 29,
    "LABEL_3": 3,
    "LABEL_30": 30,
    "LABEL_31": 31,
    "LABEL_32": 32,
    "LABEL_33": 33,
    "LABEL_34": 34,
    "LABEL_4": 4,
    "LABEL_5": 5,
    "LABEL_6": 6,
    "LABEL_7": 7,
    "LABEL_8": 8,
    "LABEL_9": 9
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.53.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
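The 35 generic label names in `config.json` are the Trainer defaults for a head whose classes were never named, and can be regenerated programmatically. A minimal sketch (the sorted-keys note reflects how the file above is serialized, with `label2id` keys in lexicographic order):

```python
# Regenerate the default id2label / label2id maps for a 35-class
# token-classification head, matching the generic names in config.json.
num_labels = 35
id2label = {i: f"LABEL_{i}" for i in range(num_labels)}
label2id = {name: i for i, name in id2label.items()}

# The serialized config sorts label2id keys lexicographically, which is
# why "LABEL_10" precedes "LABEL_2" in the file above.
first_keys = sorted(label2id)[:3]
print(first_keys)  # ['LABEL_0', 'LABEL_1', 'LABEL_10']
```

Replacing these with meaningful punctuation names (e.g. mapping each class to its mark) would make the checkpoint self-describing for downstream users.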
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:69420afd29773f98970213232adec8d204b56a2782a8105663eee077bdc46ce0
size 1336559468
runs/Nov03_13-29-48_da1f5c763b14/events.out.tfevents.1762176590.da1f5c763b14.2503.0 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c1b7745722c81da2a5dfa6f1f8a76a53eadf8d40cf1bd0571e1acdd7ec3d3fbf
size 6376
runs/Nov03_13-40-02_da1f5c763b14/events.out.tfevents.1762177203.da1f5c763b14.2503.1 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b647d79dd2fa26856d72db327e4dc12bd3526485e2164d3dba0faa37143c43f0
size 6376
runs/Nov03_13-41-51_da1f5c763b14/events.out.tfevents.1762177311.da1f5c763b14.2503.2 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:244dfa48870acf9929dd487ff1c8d538fabb1c54000f417429d0e8758b5b7368
size 6376
runs/Nov03_13-44-50_da1f5c763b14/events.out.tfevents.1762177490.da1f5c763b14.2503.3 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cb985cb05d2fa36610a53379035ff56a22f2f1ac91dc12f9d5272fb15b953e66
size 6376
runs/Nov03_13-45-28_da1f5c763b14/events.out.tfevents.1762177529.da1f5c763b14.2503.4 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cebad3def353ca45c844e919f11cfa6135356571d6a5e2a70cd744b8d9b7ec40
size 24838
runs/Nov03_13-45-28_da1f5c763b14/events.out.tfevents.1762181280.da1f5c763b14.2503.5 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:85664abd88f423ece548996ae231335fa64dcd1db080b0bfc9b7c5b35da25db6
size 1720
runs/Nov22_08-33-36_ca85e5befb5e/events.out.tfevents.1763800416.ca85e5befb5e.50032.0 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3bbcceb892219d1cc5b24c26a1ec2e71a8a193b848ac1a08ebfe76032d287c88
size 6425
runs/Nov22_08-34-31_ca85e5befb5e/events.out.tfevents.1763800472.ca85e5befb5e.50032.1 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ba19e5152b0009ef22f0ff6840c768eb5bac6d6ddc3cf8c479c61e30afe62c2b
size 6425
runs/Nov22_08-34-41_ca85e5befb5e/events.out.tfevents.1763800482.ca85e5befb5e.50032.2 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1594dcb83710a4ae99b55c79a5f30d6e09a1ed967210303274817ad2a7d5a260
size 6425
runs/Nov22_08-36-19_ca85e5befb5e/events.out.tfevents.1763800580.ca85e5befb5e.50389.0 ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a6b473275ad683b655f6d256c544e308c490ba66d732470f21d402edc0df0fa
size 80656
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,56 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": false,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
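The special-token ids in `added_tokens_decoder` above are fixed positions in the BERT uncased vocabulary, and per-token predictions at those positions carry no punctuation information. A small sketch of filtering them out (`strip_special` is an illustrative helper, not part of this repository; the ids are taken directly from the file above):

```python
# Special-token ids declared in added_tokens_decoder in tokenizer_config.json.
SPECIAL_TOKEN_IDS = {
    "[PAD]": 0,
    "[UNK]": 100,
    "[CLS]": 101,
    "[SEP]": 102,
    "[MASK]": 103,
}

def strip_special(token_ids):
    """Drop special-token positions, e.g. before decoding per-token predictions."""
    special = set(SPECIAL_TOKEN_IDS.values())
    return [t for t in token_ids if t not in special]

print(strip_special([101, 7592, 2088, 102]))  # [7592, 2088]
```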
training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:412aa651c89c0fabc247a3ae4bf65afbd8823d2e71750517cdb4b32b0c847fa1
size 5304
vocab.txt ADDED
The diff for this file is too large to render. See raw diff