yazgisert committed
Commit c6e6285 · verified · 1 Parent(s): f75e8e7

End of training

README.md CHANGED
@@ -14,8 +14,6 @@ should probably proofread and complete it, then remove this comment. -->
  # msa_prot_t5
 
  This model is a fine-tuned version of [Rostlab/prot_t5_xl_uniref50](https://huggingface.co/Rostlab/prot_t5_xl_uniref50) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 2.9147
 
  ## Model description
 
@@ -34,118 +32,17 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.001
- - train_batch_size: 8
- - eval_batch_size: 8
+ - learning_rate: 1e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
  - seed: 42
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 100
  - num_epochs: 100
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | No log | 1.0 | 1 | 2.9866 |
- | No log | 2.0 | 2 | 3.1698 |
- | No log | 3.0 | 3 | 3.1083 |
- | No log | 4.0 | 4 | 2.9634 |
- | No log | 5.0 | 5 | 2.9300 |
- | No log | 6.0 | 6 | 3.2760 |
- | No log | 7.0 | 7 | 2.8957 |
- | No log | 8.0 | 8 | 2.9622 |
- | No log | 9.0 | 9 | 2.9933 |
- | No log | 10.0 | 10 | 2.8233 |
- | No log | 11.0 | 11 | 2.7758 |
- | No log | 12.0 | 12 | 2.9012 |
- | No log | 13.0 | 13 | 2.9936 |
- | No log | 14.0 | 14 | 2.9671 |
- | No log | 15.0 | 15 | 3.0426 |
- | No log | 16.0 | 16 | 2.9478 |
- | No log | 17.0 | 17 | 2.9906 |
- | No log | 18.0 | 18 | 2.9911 |
- | No log | 19.0 | 19 | 2.8639 |
- | No log | 20.0 | 20 | 2.8557 |
- | No log | 21.0 | 21 | 2.9053 |
- | No log | 22.0 | 22 | 2.8347 |
- | No log | 23.0 | 23 | 2.9482 |
- | No log | 24.0 | 24 | 2.9520 |
- | No log | 25.0 | 25 | 3.0104 |
- | No log | 26.0 | 26 | 2.8693 |
- | No log | 27.0 | 27 | 2.8381 |
- | No log | 28.0 | 28 | 2.8333 |
- | No log | 29.0 | 29 | 2.8644 |
- | No log | 30.0 | 30 | 3.0112 |
- | No log | 31.0 | 31 | 3.1106 |
- | No log | 32.0 | 32 | 3.0417 |
- | No log | 33.0 | 33 | 2.7188 |
- | No log | 34.0 | 34 | 2.9129 |
- | No log | 35.0 | 35 | 2.7620 |
- | No log | 36.0 | 36 | 2.7964 |
- | No log | 37.0 | 37 | 2.8155 |
- | No log | 38.0 | 38 | 2.9062 |
- | No log | 39.0 | 39 | 2.8137 |
- | No log | 40.0 | 40 | 2.9159 |
- | No log | 41.0 | 41 | 2.8783 |
- | No log | 42.0 | 42 | 3.1200 |
- | No log | 43.0 | 43 | 2.9103 |
- | No log | 44.0 | 44 | 2.9167 |
- | No log | 45.0 | 45 | 2.8640 |
- | No log | 46.0 | 46 | 2.7939 |
- | No log | 47.0 | 47 | 3.0191 |
- | No log | 48.0 | 48 | 2.8166 |
- | No log | 49.0 | 49 | 3.1344 |
- | 2.9463 | 50.0 | 50 | 3.0017 |
- | 2.9463 | 51.0 | 51 | 3.0631 |
- | 2.9463 | 52.0 | 52 | 2.6599 |
- | 2.9463 | 53.0 | 53 | 2.9787 |
- | 2.9463 | 54.0 | 54 | 2.7147 |
- | 2.9463 | 55.0 | 55 | 2.9215 |
- | 2.9463 | 56.0 | 56 | 2.8183 |
- | 2.9463 | 57.0 | 57 | 2.9195 |
- | 2.9463 | 58.0 | 58 | 2.9742 |
- | 2.9463 | 59.0 | 59 | 2.9367 |
- | 2.9463 | 60.0 | 60 | 2.8563 |
- | 2.9463 | 61.0 | 61 | 3.2135 |
- | 2.9463 | 62.0 | 62 | 2.9945 |
- | 2.9463 | 63.0 | 63 | 2.9708 |
- | 2.9463 | 64.0 | 64 | 2.8022 |
- | 2.9463 | 65.0 | 65 | 2.9473 |
- | 2.9463 | 66.0 | 66 | 2.9607 |
- | 2.9463 | 67.0 | 67 | 2.8410 |
- | 2.9463 | 68.0 | 68 | 2.8940 |
- | 2.9463 | 69.0 | 69 | 2.9710 |
- | 2.9463 | 70.0 | 70 | 3.0025 |
- | 2.9463 | 71.0 | 71 | 2.8677 |
- | 2.9463 | 72.0 | 72 | 2.8281 |
- | 2.9463 | 73.0 | 73 | 2.9339 |
- | 2.9463 | 74.0 | 74 | 2.9076 |
- | 2.9463 | 75.0 | 75 | 2.8363 |
- | 2.9463 | 76.0 | 76 | 2.9525 |
- | 2.9463 | 77.0 | 77 | 2.8536 |
- | 2.9463 | 78.0 | 78 | 2.8605 |
- | 2.9463 | 79.0 | 79 | 2.9587 |
- | 2.9463 | 80.0 | 80 | 2.9319 |
- | 2.9463 | 81.0 | 81 | 2.9245 |
- | 2.9463 | 82.0 | 82 | 2.8225 |
- | 2.9463 | 83.0 | 83 | 2.8640 |
- | 2.9463 | 84.0 | 84 | 2.8604 |
- | 2.9463 | 85.0 | 85 | 2.7640 |
- | 2.9463 | 86.0 | 86 | 2.9671 |
- | 2.9463 | 87.0 | 87 | 2.9539 |
- | 2.9463 | 88.0 | 88 | 2.9196 |
- | 2.9463 | 89.0 | 89 | 2.7831 |
- | 2.9463 | 90.0 | 90 | 2.8095 |
- | 2.9463 | 91.0 | 91 | 2.9585 |
- | 2.9463 | 92.0 | 92 | 2.8277 |
- | 2.9463 | 93.0 | 93 | 2.8445 |
- | 2.9463 | 94.0 | 94 | 2.9094 |
- | 2.9463 | 95.0 | 95 | 2.9313 |
- | 2.9463 | 96.0 | 96 | 2.8166 |
- | 2.9463 | 97.0 | 97 | 2.9152 |
- | 2.9463 | 98.0 | 98 | 2.8646 |
- | 2.9463 | 99.0 | 99 | 2.9297 |
- | 2.8987 | 100.0 | 100 | 2.9147 |
 
 
  ### Framework versions
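For reference, the new hyperparameter set maps onto the 🤗 Transformers `TrainingArguments` API roughly as follows. This is a minimal sketch, not the training script from this commit; the `output_dir` and the surrounding dataset/`Trainer` wiring are assumptions.

```python
# Minimal sketch of TrainingArguments matching the updated README values.
# Only the numeric settings come from the diff above; output_dir is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="msa_prot_t5",        # assumption: repo name used as output dir
    learning_rate=1e-5,              # was 0.001 before this commit
    per_device_train_batch_size=16,  # was 8
    per_device_eval_batch_size=16,   # was 8
    seed=42,
    optim="adamw_torch",             # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,                # lr_scheduler_warmup_steps, new in this commit
    num_train_epochs=100,
)
```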
config.json CHANGED
@@ -2,13 +2,14 @@
  "architectures": [
    "T5ForConditionalGeneration"
  ],
+ "attention_dropout_rate": 0.0,
  "classifier_dropout": 0.0,
  "d_ff": 16384,
  "d_kv": 128,
  "d_model": 1024,
  "decoder_start_token_id": 0,
  "dense_act_fn": "relu",
- "dropout_rate": 0.1,
+ "dropout_rate": 0.0,
  "eos_token_id": 1,
  "feed_forward_proj": "relu",
  "initializer_factor": 1.0,
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f162d653c5142578d55bccb255bda40b64ce74d9240ae2586f33f04a2dad1ee8
+ oid sha256:927fc7ef55e0fd6efdf53072c2aa3570a5fb730391fca53363c3c6c067abc3d3
  size 4966822528
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4f5788242633fbc5c77b9af7047cbc1a1f3e6357cacdd9a0ca18132fcca83c50
+ oid sha256:26395895b452debdb321c39cfa40cac9a0e0c261304826946d62a81543d3f58b
  size 4999865056
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6b28dd9e33826c7e116270180dcfb66859e516b129a0800c2e38520216e340b6
+ oid sha256:fdfb546a62cb257d570e7b59b2358ddecd663523c74abfa51775d0beab35e0b5
  size 1308696208
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:53a1f5ec7d74ab9a077b6be66ef01f7ef9be9118cf0d72c03647a448f167fbe7
- size 5713
+ oid sha256:65838f4e72d5a9c6eb1f4289eb5a635f7b9022ab7ca7c6111fe8664384910734
+ size 5777
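The weight and training-args files above are Git LFS pointers: the repository stores only a `version` line, an `oid sha256:…`, and a `size`, while the actual blob lives in LFS storage. If you want to check a downloaded shard against its pointer, a sketch along these lines works (the local file path is an assumption):

```python
# Sketch: verify a downloaded file against its Git LFS pointer (oid + size).
import hashlib
import os

def verify_lfs_blob(path: str, expected_sha256: str, expected_size: int) -> bool:
    """Return True if the file's size and SHA-256 match the LFS pointer."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256

# Values taken from the updated pointer for model-00001-of-00003.safetensors:
ok = verify_lfs_blob(
    "model-00001-of-00003.safetensors",  # assumed local path
    "927fc7ef55e0fd6efdf53072c2aa3570a5fb730391fca53363c3c6c067abc3d3",
    4966822528,
)
print(ok)
```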