MSLars committed on
Commit 9175deb · verified · 1 Parent(s): 377a605

Training in progress, epoch 1

Files changed (17)
  1. README.md +117 -0
  2. config.json +43 -0
  3. model.safetensors +3 -0
  4. runs/Jan28_17-25-03_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738081504.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.37035.0 +3 -0
  5. runs/Jan28_19-23-29_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738088609.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.52319.0 +3 -0
  6. runs/Jan30_10-07-45_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228066.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.242514.0 +3 -0
  7. runs/Jan30_10-08-53_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228134.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.243666.0 +3 -0
  8. runs/Jan30_10-09-24_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228165.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.243729.0 +3 -0
  9. runs/Jan30_10-13-43_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228424.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.244406.0 +3 -0
  10. runs/Jan30_10-21-27_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228888.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.244914.0 +3 -0
  11. runs/Jan30_10-26-12_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738229173.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.245109.0 +3 -0
  12. runs/Jan30_10-29-28_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738229369.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.245379.0 +3 -0
  13. special_tokens_map.json +7 -0
  14. tokenizer.json +0 -0
  15. tokenizer_config.json +59 -0
  16. training_args.bin +3 -0
  17. vocab.txt +0 -0
README.md ADDED
@@ -0,0 +1,117 @@
+ ---
+ library_name: transformers
+ license: mit
+ base_model: deepset/gbert-large
+ tags:
+ - generated_from_trainer
+ metrics:
+ - precision
+ - recall
+ - f1
+ - accuracy
+ model-index:
+ - name: testner
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # testner
+
+ This model is a fine-tuned version of [deepset/gbert-large](https://huggingface.co/deepset/gbert-large) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.3944
+ - Precision: 0.2579
+ - Recall: 0.2364
+ - F1: 0.2467
+ - Accuracy: 0.8626
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 100
+ - num_epochs: 50
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
+ |:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
+ | No log | 1.0 | 229 | 0.5531 | 0.0305 | 0.0133 | 0.0185 | 0.8440 |
+ | No log | 2.0 | 458 | 0.5008 | 0.1281 | 0.1055 | 0.1157 | 0.8582 |
+ | 0.5648 | 3.0 | 687 | 0.5616 | 0.1521 | 0.1648 | 0.1582 | 0.8532 |
+ | 0.5648 | 4.0 | 916 | 0.5269 | 0.1665 | 0.2315 | 0.1937 | 0.8536 |
+ | 0.2466 | 5.0 | 1145 | 0.6401 | 0.1885 | 0.2073 | 0.1975 | 0.8551 |
+ | 0.2466 | 6.0 | 1374 | 0.6759 | 0.1944 | 0.2036 | 0.1989 | 0.8592 |
+ | 0.1155 | 7.0 | 1603 | 0.7172 | 0.1859 | 0.2206 | 0.2018 | 0.8559 |
+ | 0.1155 | 8.0 | 1832 | 0.8176 | 0.2 | 0.2194 | 0.2092 | 0.8555 |
+ | 0.0612 | 9.0 | 2061 | 0.8450 | 0.1904 | 0.2315 | 0.2090 | 0.8519 |
+ | 0.0612 | 10.0 | 2290 | 0.9029 | 0.1895 | 0.2048 | 0.1969 | 0.8535 |
+ | 0.0376 | 11.0 | 2519 | 0.9917 | 0.2097 | 0.2194 | 0.2145 | 0.8548 |
+ | 0.0376 | 12.0 | 2748 | 0.9464 | 0.2346 | 0.2485 | 0.2413 | 0.8609 |
+ | 0.0376 | 13.0 | 2977 | 1.0170 | 0.2295 | 0.2412 | 0.2352 | 0.8585 |
+ | 0.022 | 14.0 | 3206 | 0.9993 | 0.2259 | 0.2242 | 0.2251 | 0.8590 |
+ | 0.022 | 15.0 | 3435 | 1.0762 | 0.2194 | 0.2473 | 0.2325 | 0.8528 |
+ | 0.0152 | 16.0 | 3664 | 1.0343 | 0.2434 | 0.2364 | 0.2399 | 0.8616 |
+ | 0.0152 | 17.0 | 3893 | 1.0420 | 0.2241 | 0.2388 | 0.2312 | 0.8570 |
+ | 0.0137 | 18.0 | 4122 | 1.1025 | 0.2214 | 0.2206 | 0.2210 | 0.8610 |
+ | 0.0137 | 19.0 | 4351 | 1.0975 | 0.2186 | 0.2339 | 0.2260 | 0.8540 |
+ | 0.0099 | 20.0 | 4580 | 1.1521 | 0.2281 | 0.2436 | 0.2356 | 0.8592 |
+ | 0.0099 | 21.0 | 4809 | 1.1143 | 0.2080 | 0.2461 | 0.2254 | 0.8527 |
+ | 0.0084 | 22.0 | 5038 | 1.2333 | 0.2368 | 0.24 | 0.2384 | 0.8567 |
+ | 0.0084 | 23.0 | 5267 | 1.1713 | 0.2367 | 0.2364 | 0.2365 | 0.8595 |
+ | 0.0084 | 24.0 | 5496 | 1.2162 | 0.2599 | 0.2315 | 0.2449 | 0.8643 |
+ | 0.0065 | 25.0 | 5725 | 1.1444 | 0.2467 | 0.2473 | 0.2470 | 0.8600 |
+ | 0.0065 | 26.0 | 5954 | 1.2645 | 0.2512 | 0.2545 | 0.2529 | 0.8617 |
+ | 0.0046 | 27.0 | 6183 | 1.2562 | 0.2252 | 0.2255 | 0.2253 | 0.8610 |
+ | 0.0046 | 28.0 | 6412 | 1.2663 | 0.2516 | 0.2327 | 0.2418 | 0.8615 |
+ | 0.0043 | 29.0 | 6641 | 1.2686 | 0.2565 | 0.2497 | 0.2531 | 0.8622 |
+ | 0.0043 | 30.0 | 6870 | 1.2411 | 0.2342 | 0.2521 | 0.2428 | 0.8586 |
+ | 0.0037 | 31.0 | 7099 | 1.2620 | 0.2553 | 0.2485 | 0.2518 | 0.8626 |
+ | 0.0037 | 32.0 | 7328 | 1.3049 | 0.2506 | 0.24 | 0.2452 | 0.8593 |
+ | 0.003 | 33.0 | 7557 | 1.2796 | 0.2516 | 0.2339 | 0.2425 | 0.8633 |
+ | 0.003 | 34.0 | 7786 | 1.3039 | 0.2484 | 0.2339 | 0.2409 | 0.8625 |
+ | 0.0025 | 35.0 | 8015 | 1.3241 | 0.2597 | 0.2436 | 0.2514 | 0.8618 |
+ | 0.0025 | 36.0 | 8244 | 1.3132 | 0.2475 | 0.2436 | 0.2456 | 0.8613 |
+ | 0.0025 | 37.0 | 8473 | 1.3445 | 0.25 | 0.2388 | 0.2443 | 0.8620 |
+ | 0.002 | 38.0 | 8702 | 1.3669 | 0.2556 | 0.2339 | 0.2443 | 0.8635 |
+ | 0.002 | 39.0 | 8931 | 1.3566 | 0.2623 | 0.2448 | 0.2533 | 0.8622 |
+ | 0.0018 | 40.0 | 9160 | 1.3300 | 0.2447 | 0.2388 | 0.2417 | 0.8620 |
+ | 0.0018 | 41.0 | 9389 | 1.3311 | 0.2397 | 0.24 | 0.2399 | 0.8624 |
+ | 0.0019 | 42.0 | 9618 | 1.3368 | 0.2469 | 0.2412 | 0.2440 | 0.8625 |
+ | 0.0019 | 43.0 | 9847 | 1.3701 | 0.2430 | 0.2412 | 0.2421 | 0.8624 |
+ | 0.0014 | 44.0 | 10076 | 1.3941 | 0.2286 | 0.2327 | 0.2306 | 0.8619 |
+ | 0.0014 | 45.0 | 10305 | 1.3842 | 0.2506 | 0.2352 | 0.2427 | 0.8628 |
+ | 0.0013 | 46.0 | 10534 | 1.3827 | 0.2443 | 0.2327 | 0.2384 | 0.8619 |
+ | 0.0013 | 47.0 | 10763 | 1.3730 | 0.2506 | 0.2376 | 0.2439 | 0.8632 |
+ | 0.0013 | 48.0 | 10992 | 1.3936 | 0.2586 | 0.2364 | 0.2470 | 0.8629 |
+ | 0.0011 | 49.0 | 11221 | 1.3941 | 0.2634 | 0.2388 | 0.2505 | 0.8627 |
+ | 0.0011 | 50.0 | 11450 | 1.3944 | 0.2579 | 0.2364 | 0.2467 | 0.8626 |
+
+
+ ### Framework versions
+
+ - Transformers 4.48.1
+ - Pytorch 2.5.1
+ - Datasets 3.2.0
+ - Tokenizers 0.21.0
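The F1 column in the README's results table can be sanity-checked against the precision and recall columns. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the README does not state which metric implementation produced the numbers, and the helper name `f1_score` is illustrative. Because the table values are already rounded to four decimals, the comparison uses a small tolerance.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1), copied from the results table.
rows = [
    (1, 0.0305, 0.0133, 0.0185),
    (26, 0.2512, 0.2545, 0.2529),
    (39, 0.2623, 0.2448, 0.2533),
    (50, 0.2579, 0.2364, 0.2467),
]

for epoch, p, r, reported in rows:
    recomputed = f1_score(p, r)
    # Inputs are rounded to 4 decimals, so allow a small tolerance.
    assert abs(recomputed - reported) < 5e-4, (epoch, recomputed, reported)
    print(f"epoch {epoch:2d}: recomputed F1 = {recomputed:.4f}, reported = {reported:.4f}")
```

The same check can be run over every row of the table to spot transcription errors in a model card.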
config.json ADDED
@@ -0,0 +1,43 @@
+ {
+   "_name_or_path": "deepset/gbert-large",
+   "architectures": [
+     "BertForTokenClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "O",
+     "1": "B-semantic",
+     "2": "I-semantic",
+     "3": "B-grammar",
+     "4": "I-grammar",
+     "5": "B-non-sense",
+     "6": "I-non-sense"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "B-grammar": 3,
+     "B-non-sense": 5,
+     "B-semantic": 1,
+     "I-grammar": 4,
+     "I-non-sense": 6,
+     "I-semantic": 2,
+     "O": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.48.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 31102
+ }
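The `id2label` map in config.json defines a BIO tagging scheme over three span types (`semantic`, `grammar`, `non-sense`). As a minimal sketch of how per-token label ids could be grouped into spans, the following uses only the label map from the config; the function name `decode_spans` and the example id sequence are hypothetical, and handling of subword tokens and special tokens is deliberately left out.

```python
from typing import List, Tuple

# Label map copied from the "id2label" entry in config.json.
ID2LABEL = {
    0: "O",
    1: "B-semantic",
    2: "I-semantic",
    3: "B-grammar",
    4: "I-grammar",
    5: "B-non-sense",
    6: "I-non-sense",
}


def decode_spans(label_ids: List[int]) -> List[Tuple[str, int, int]]:
    """Group per-token label ids into (type, start, end) spans, end exclusive."""
    spans = []
    current = None  # [type, start] of the span being built, if any
    for i, label_id in enumerate(label_ids):
        label = ID2LABEL[label_id]
        prefix, _, kind = label.partition("-")
        # "I-x" extends an open span of the same type.
        if prefix == "I" and current is not None and current[0] == kind:
            continue
        # Any other label closes the open span.
        if current is not None:
            spans.append((current[0], current[1], i))
            current = None
        # "B-x" opens a span; a stray "I-x" is treated as an opening tag too.
        if prefix in ("B", "I"):
            current = [kind, i]
    if current is not None:
        spans.append((current[0], current[1], len(label_ids)))
    return spans


# Hypothetical label ids for an eight-token input (not actual model output).
example_ids = [0, 3, 4, 0, 5, 6, 6, 1]
print(decode_spans(example_ids))
```

Treating a stray `I-` tag as the start of a span is one lenient convention among several; a stricter decoder could discard such tags instead.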
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b66b2f00d646dc90f9da12483e1dedf6b226fd30da08721bab00e4808847792
+ size 1338820348
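The three lines added for model.safetensors are a Git LFS pointer, not the weights themselves: the `size` field gives the byte length of the real file. A rough sketch of reading such a pointer and estimating the parameter count follows. It assumes every weight is stored as 4-byte float32 (consistent with `torch_dtype` in config.json) and ignores the safetensors header, so the estimate is slightly high; `parse_lfs_pointer` is an illustrative helper, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2b66b2f00d646dc90f9da12483e1dedf6b226fd30da08721bab00e4808847792
size 1338820348
"""

info = parse_lfs_pointer(pointer)
size_gb = info["size"] / 1e9
# Assumption: 4 bytes per weight (float32); the file header is ignored.
approx_params = info["size"] // 4
print(f"{size_gb:.2f} GB, about {approx_params / 1e6:.0f}M float32 parameters")
```

The same parser applies to the other LFS pointers in this commit, such as the TensorBoard event files and training_args.bin.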
runs/Jan28_17-25-03_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738081504.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.37035.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7205484c970bedbc2f2654dabb9b2f3d5f00a50d7ac9e69943c80e0452a5cb95
+ size 11467
runs/Jan28_19-23-29_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738088609.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.52319.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:65fafbc6ddc29b6229e63c3383d4d0be509db445cb889a0b304ea8a5e564c382
+ size 34145
runs/Jan30_10-07-45_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228066.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.242514.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b158cb85fa6ba428c3cfd114ba6bdc7a2c10700f1f43b584c63d820c5fed966
+ size 5560
runs/Jan30_10-08-53_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228134.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.243666.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4d03a20cead7be143d54a996dc256f9542dbaa76bfa02254d9656b385964bf91
+ size 5560
runs/Jan30_10-09-24_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228165.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.243729.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:95f4fe10644c38cc10621703e000736b0999d535f4a780ed6ca10b7cb272bdb5
+ size 5560
runs/Jan30_10-13-43_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228424.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.244406.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7d64fc5c6cbf98d091e13cee22e007785c995b899a0c546602e7ac94d0eb592b
+ size 9412
runs/Jan30_10-21-27_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738228888.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.244914.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0ca77aa7bee0578129a42daeb6ddca736562ed7ac121f64b5974b2e31ec2f79b
+ size 5484
runs/Jan30_10-26-12_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738229173.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.245109.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:42486937758f2d8d3f3073ddc842eea64b8167485f2a40a4ab23d3ef06f0e8da
+ size 6969
runs/Jan30_10-29-28_ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be/events.out.tfevents.1738229369.ptr-65mvotgovyrvuqjrx6z.18120a2.ip6.access.telenet.be.245379.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c5e847181648fea63349e19b110da86c485aac594d6b766ef047f569ca2d8496
+ size 6121
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": false,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "max_len": 512,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": false,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fc6e0713c0ce29fabd9d2d679804ff3a1134661e0d83864d0b203e13dd7885b8
+ size 5496
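training_args.bin is a binary LFS object, so the training arguments it stores are not readable in this diff; the hyperparameter list in the README is the human-readable record. As a rough illustration of what `lr_scheduler_type: linear` with `lr_scheduler_warmup_steps: 100` and `learning_rate: 2e-05` imply, the sketch below assumes the common shape of a linear schedule with warmup (ramp from 0 to the base rate, then linear decay to 0) and takes the total of 11450 steps from the last row of the results table (229 steps per epoch × 50 epochs). The function name `linear_warmup_lr` is hypothetical.

```python
def linear_warmup_lr(step: int, base_lr: float = 2e-5,
                     warmup_steps: int = 100, total_steps: int = 11450) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Sample the schedule at a few steps, including the end of epoch 1 (step 229).
for step in (0, 50, 100, 229, 5725, 11450):
    print(f"step {step:5d}: lr = {linear_warmup_lr(step):.3e}")
```

Under these assumptions the peak rate is reached after 100 steps, less than half of the first epoch, and the rate decays to zero at the final step.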
vocab.txt ADDED
The diff for this file is too large to render. See raw diff