Committed by Sandrro
Commit e8fe77e · 1 parent: 0a8134e

End of training

README.md ADDED
@@ -0,0 +1,74 @@
+ ---
+ license: mit
+ tags:
+ - generated_from_trainer
+ metrics:
+ - f1
+ model-index:
+ - name: text_to_subfunction_v7
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # text_to_subfunction_v7
+
+ This model is a fine-tuned version of [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.9393
+ - F1: 0.4657
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 4
+ - eval_batch_size: 4
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 15
+ - mixed_precision_training: Native AMP
+
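Reviewer note: the hyperparameters listed in the model card (`learning_rate: 5e-05`, `lr_scheduler_type: linear`, `num_epochs: 15`), together with the 4995 steps per epoch reported in the training results table, imply a learning rate that decays linearly over 15 × 4995 = 74925 optimizer steps. The sketch below illustrates that schedule. It assumes no warmup (the card lists none) and decay to zero at the final step. `linear_lr` is a hypothetical helper written for illustration, not a Trainer API.

```python
# Values taken from the model card; the helper itself is illustrative only.
BASE_LR = 5e-05          # learning_rate
STEPS_PER_EPOCH = 4995   # step count after epoch 1 in the results table
NUM_EPOCHS = 15          # num_epochs
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not list a warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at a few epoch boundaries.
    for epoch in (0, 5, 10, 15):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate has fallen to two thirds of its initial value by the end of epoch 5 and reaches zero at step 74925.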
+ ### Training results
+
+ | Training Loss | Epoch | Step  | Validation Loss | F1     |
+ |:-------------:|:-----:|:-----:|:---------------:|:------:|
+ | 3.0953        | 1.0   | 4995  | 2.9647          | 0.1986 |
+ | 2.2212        | 2.0   | 9990  | 2.3916          | 0.3411 |
+ | 1.7716        | 3.0   | 14985 | 2.1448          | 0.3938 |
+ | 1.4083        | 4.0   | 19980 | 2.0778          | 0.4358 |
+ | 1.1092        | 5.0   | 24975 | 2.0726          | 0.4608 |
+ | 0.8501        | 6.0   | 29970 | 2.1499          | 0.4652 |
+ | 0.5973        | 7.0   | 34965 | 2.2423          | 0.4586 |
+ | 0.4056        | 8.0   | 39960 | 2.3822          | 0.4605 |
+ | 0.3375        | 9.0   | 44955 | 2.5109          | 0.4564 |
+ | 0.2773        | 10.0  | 49950 | 2.6337          | 0.4590 |
+ | 0.2134        | 11.0  | 54945 | 2.7191          | 0.4698 |
+ | 0.1712        | 12.0  | 59940 | 2.8171          | 0.4634 |
+ | 0.1061        | 13.0  | 64935 | 2.8741          | 0.4687 |
+ | 0.1533        | 14.0  | 69930 | 2.9266          | 0.4665 |
+ | 0.0837        | 15.0  | 74925 | 2.9393          | 0.4657 |
+
+
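Reviewer note: the training results table shows training loss falling steadily while validation loss bottoms out early and then climbs, a pattern that usually points to overfitting. The sketch below copies the table into a list and picks out the epoch with the best F1 and the epoch with the lowest validation loss. The dataset-size estimate assumes a single device and no gradient accumulation, neither of which the card states, so treat it as approximate.

```python
# (epoch, training_loss, validation_loss, f1), copied from the results table.
RESULTS = [
    (1, 3.0953, 2.9647, 0.1986),
    (2, 2.2212, 2.3916, 0.3411),
    (3, 1.7716, 2.1448, 0.3938),
    (4, 1.4083, 2.0778, 0.4358),
    (5, 1.1092, 2.0726, 0.4608),
    (6, 0.8501, 2.1499, 0.4652),
    (7, 0.5973, 2.2423, 0.4586),
    (8, 0.4056, 2.3822, 0.4605),
    (9, 0.3375, 2.5109, 0.4564),
    (10, 0.2773, 2.6337, 0.4590),
    (11, 0.2134, 2.7191, 0.4698),
    (12, 0.1712, 2.8171, 0.4634),
    (13, 0.1061, 2.8741, 0.4687),
    (14, 0.1533, 2.9266, 0.4665),
    (15, 0.0837, 2.9393, 0.4657),
]

STEPS_PER_EPOCH = 4995   # step count after epoch 1
TRAIN_BATCH_SIZE = 4     # train_batch_size from the hyperparameters

best_f1 = max(RESULTS, key=lambda row: row[3])     # row with the highest F1
best_loss = min(RESULTS, key=lambda row: row[2])   # row with the lowest validation loss
final = RESULTS[-1]                                # the checkpoint the card reports

# Rough upper bound on training-set size (assumes one device, no accumulation).
approx_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print(f"best F1:        epoch {best_f1[0]}, F1 {best_f1[3]}")
    print(f"lowest val loss: epoch {best_loss[0]}, loss {best_loss[2]}")
    print(f"final epoch:    epoch {final[0]}, loss {final[2]}, F1 {final[3]}")
    print(f"approx. training examples: {approx_train_examples}")
```

The headline numbers in the card (Loss 2.9393, F1 0.4657) are the epoch-15 values. F1 peaks at epoch 11 and validation loss is lowest at epoch 5, so an earlier checkpoint may be worth comparing if one was saved.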
+ ### Framework versions
+
+ - Transformers 4.27.1
+ - Pytorch 2.1.0.dev20230414+cu117
+ - Datasets 2.9.0
+ - Tokenizers 0.13.3
logs/events.out.tfevents.1688632604.SERVER-509.13280.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cbbeb56eaeda1aa22385e4a65497a4ba58b2d55f9e937c19850788830fcf838f
- size 150639
+ oid sha256:9cf8d35b4c8bb1e5778aa6877045d03aa21f2c855f358d88246f2fb1b2044791
+ size 150999
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:82c1a21972edf79c1b1d1ff0e0f12e3d6c6a0c358ed97272bf9728e27e68d891
+ oid sha256:b2ef096fe7bbc5c12a80112bb0fb25b6e7cd102aee41a58ccd8d8d326eb12582
  size 116995039
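Reviewer note: the two files above are stored as Git LFS pointers, so the diff shows only a SHA-256 digest and a byte size, not the content. For `pytorch_model.bin` the size is unchanged and only the digest differs. The sketch below shows one way a downloaded file could be checked against such a pointer. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for illustration, not part of Git LFS or any Hugging Face library.

```python
import hashlib
import os

# New pointer for pytorch_model.bin, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:b2ef096fe7bbc5c12a80112bb0fb25b6e7cd102aee41a58ccd8d8d326eb12582
size 116995039
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing a file of the wrong size
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["oid"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER)
    print(pointer["oid"], pointer["size"])
```

Calling `matches_pointer("pytorch_model.bin", parse_lfs_pointer(POINTER))` on a local copy would then confirm that the download corresponds to this commit; the local path is an assumption.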
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": false,
+   "mask_token": "[MASK]",
+   "model_max_length": 2048,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "special_tokens_map_file": null,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff