pt430187 committed · Commit 6e92352 · verified · 1 Parent(s): 6ff41db

End of training

Files changed (1)
  1. README.md +65 -44
README.md CHANGED
@@ -1,5 +1,6 @@
1
  ---
2
  library_name: transformers
 
3
  tags:
4
  - generated_from_trainer
5
  model-index:
@@ -12,9 +13,9 @@ should probably proofread and complete it, then remove this comment. -->
12
 
13
  # calculator_model_test
14
 
15
- This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
16
  It achieves the following results on the evaluation set:
17
- - Loss: 0.6665
18
 
19
  ## Model description
20
 
@@ -33,58 +34,78 @@ More information needed
33
  ### Training hyperparameters
34
 
35
  The following hyperparameters were used during training:
36
- - learning_rate: 0.001
37
  - train_batch_size: 512
38
  - eval_batch_size: 512
39
  - seed: 42
40
  - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
41
  - lr_scheduler_type: linear
42
- - num_epochs: 40
43
 
44
  ### Training results
45
 
46
  | Training Loss | Epoch | Step | Validation Loss |
47
  |:-------------:|:-----:|:----:|:---------------:|
48
- | 3.3726 | 1.0 | 6 | 2.7166 |
49
- | 2.3654 | 2.0 | 12 | 1.9424 |
50
- | 1.8270 | 3.0 | 18 | 1.6749 |
51
- | 1.6250 | 4.0 | 24 | 1.5692 |
52
- | 1.5716 | 5.0 | 30 | 1.5094 |
53
- | 1.4919 | 6.0 | 36 | 1.4596 |
54
- | 1.4614 | 7.0 | 42 | 1.5419 |
55
- | 1.4853 | 8.0 | 48 | 1.5392 |
56
- | 1.4655 | 9.0 | 54 | 1.4410 |
57
- | 1.4404 | 10.0 | 60 | 1.3921 |
58
- | 1.3948 | 11.0 | 66 | 1.3351 |
59
- | 1.3285 | 12.0 | 72 | 1.3405 |
60
- | 1.3016 | 13.0 | 78 | 1.2448 |
61
- | 1.2413 | 14.0 | 84 | 1.1757 |
62
- | 1.1750 | 15.0 | 90 | 1.1599 |
63
- | 1.1504 | 16.0 | 96 | 1.0997 |
64
- | 1.1079 | 17.0 | 102 | 1.1055 |
65
- | 1.0869 | 18.0 | 108 | 1.0325 |
66
- | 1.0752 | 19.0 | 114 | 1.0473 |
67
- | 1.0535 | 20.0 | 120 | 0.9969 |
68
- | 0.9912 | 21.0 | 126 | 0.9888 |
69
- | 0.9841 | 22.0 | 132 | 0.9631 |
70
- | 0.9649 | 23.0 | 138 | 0.9573 |
71
- | 0.9536 | 24.0 | 144 | 0.9039 |
72
- | 0.9156 | 25.0 | 150 | 0.9099 |
73
- | 0.9011 | 26.0 | 156 | 0.8433 |
74
- | 0.8627 | 27.0 | 162 | 0.8299 |
75
- | 0.8545 | 28.0 | 168 | 0.8068 |
76
- | 0.8376 | 29.0 | 174 | 0.7701 |
77
- | 0.8088 | 30.0 | 180 | 0.7594 |
78
- | 0.7884 | 31.0 | 186 | 0.7428 |
79
- | 0.7944 | 32.0 | 192 | 0.7201 |
80
- | 0.7648 | 33.0 | 198 | 0.7043 |
81
- | 0.7528 | 34.0 | 204 | 0.6929 |
82
- | 0.7325 | 35.0 | 210 | 0.7123 |
83
- | 0.7678 | 36.0 | 216 | 0.7034 |
84
- | 0.7679 | 37.0 | 222 | 0.7037 |
85
- | 0.7540 | 38.0 | 228 | 0.6732 |
86
- | 0.7231 | 39.0 | 234 | 0.6759 |
87
- | 0.7391 | 40.0 | 240 | 0.6665 |
88
 
89
 
90
  ### Framework versions
 
1
  ---
2
  library_name: transformers
3
+ base_model: pt430187/calculator_model_test
4
  tags:
5
  - generated_from_trainer
6
  model-index:
 
13
 
14
  # calculator_model_test
15
 
16
+ This model is a fine-tuned version of [pt430187/calculator_model_test](https://huggingface.co/pt430187/calculator_model_test) on the None dataset.
17
  It achieves the following results on the evaluation set:
18
+ - Loss: 0.6437
19
 
20
  ## Model description
21
 
 
34
  ### Training hyperparameters
35
 
36
  The following hyperparameters were used during training:
37
+ - learning_rate: 1e-05
38
  - train_batch_size: 512
39
  - eval_batch_size: 512
40
  - seed: 42
41
  - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
42
  - lr_scheduler_type: linear
43
+ - num_epochs: 60
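The `lr_scheduler_type: linear` setting, combined with `learning_rate: 1e-05` and `num_epochs: 60`, implies a learning rate that decays linearly to zero over the run. The following is a minimal sketch of that decay in plain Python, not the Trainer's own code. The function name `linear_lr` is hypothetical, and `warmup_steps=0` is an assumption, since the card does not list a warmup setting. The 6 steps per epoch are read from the training results table.

```python
# Sketch of a linear learning-rate decay. warmup_steps=0 is an assumption:
# the model card does not list a warmup setting.
BASE_LR = 1e-05        # learning_rate from the card
STEPS_PER_EPOCH = 6    # results table: step 6 is reached at epoch 1.0
NUM_EPOCHS = 60        # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Return the learning rate in effect after `step` optimizer steps."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 180, 360):
        print(step, linear_lr(step))
```

Under these assumptions the rate is roughly half its initial value at the midpoint of the run and reaches zero at the last step, which is consistent with the validation loss flattening out over the final epochs.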
44
 
45
  ### Training results
46
 
47
  | Training Loss | Epoch | Step | Validation Loss |
48
  |:-------------:|:-----:|:----:|:---------------:|
49
+ | 0.7873 | 1.0 | 6 | 0.7304 |
50
+ | 0.7842 | 2.0 | 12 | 0.7200 |
51
+ | 0.7695 | 3.0 | 18 | 0.7135 |
52
+ | 0.7704 | 4.0 | 24 | 0.7096 |
53
+ | 0.7592 | 5.0 | 30 | 0.7038 |
54
+ | 0.7514 | 6.0 | 36 | 0.7007 |
55
+ | 0.7428 | 7.0 | 42 | 0.6986 |
56
+ | 0.7328 | 8.0 | 48 | 0.6965 |
57
+ | 0.7498 | 9.0 | 54 | 0.6941 |
58
+ | 0.7548 | 10.0 | 60 | 0.6917 |
59
+ | 0.7523 | 11.0 | 66 | 0.6896 |
60
+ | 0.7469 | 12.0 | 72 | 0.6866 |
61
+ | 0.7519 | 13.0 | 78 | 0.6841 |
62
+ | 0.7429 | 14.0 | 84 | 0.6830 |
63
+ | 0.7311 | 15.0 | 90 | 0.6804 |
64
+ | 0.7241 | 16.0 | 96 | 0.6775 |
65
+ | 0.7400 | 17.0 | 102 | 0.6757 |
66
+ | 0.7224 | 18.0 | 108 | 0.6745 |
67
+ | 0.7311 | 19.0 | 114 | 0.6741 |
68
+ | 0.7377 | 20.0 | 120 | 0.6726 |
69
+ | 0.7249 | 21.0 | 126 | 0.6703 |
70
+ | 0.7326 | 22.0 | 132 | 0.6688 |
71
+ | 0.7181 | 23.0 | 138 | 0.6687 |
72
+ | 0.7384 | 24.0 | 144 | 0.6673 |
73
+ | 0.7146 | 25.0 | 150 | 0.6649 |
74
+ | 0.7232 | 26.0 | 156 | 0.6637 |
75
+ | 0.7190 | 27.0 | 162 | 0.6619 |
76
+ | 0.7250 | 28.0 | 168 | 0.6599 |
77
+ | 0.7236 | 29.0 | 174 | 0.6593 |
78
+ | 0.7261 | 30.0 | 180 | 0.6607 |
79
+ | 0.7203 | 31.0 | 186 | 0.6592 |
80
+ | 0.7278 | 32.0 | 192 | 0.6568 |
81
+ | 0.7066 | 33.0 | 198 | 0.6555 |
82
+ | 0.7183 | 34.0 | 204 | 0.6544 |
83
+ | 0.7074 | 35.0 | 210 | 0.6536 |
84
+ | 0.7265 | 36.0 | 216 | 0.6534 |
85
+ | 0.7120 | 37.0 | 222 | 0.6529 |
86
+ | 0.7215 | 38.0 | 228 | 0.6519 |
87
+ | 0.7147 | 39.0 | 234 | 0.6518 |
88
+ | 0.7211 | 40.0 | 240 | 0.6516 |
89
+ | 0.7143 | 41.0 | 246 | 0.6501 |
90
+ | 0.7069 | 42.0 | 252 | 0.6486 |
91
+ | 0.7063 | 43.0 | 258 | 0.6479 |
92
+ | 0.7090 | 44.0 | 264 | 0.6475 |
93
+ | 0.7055 | 45.0 | 270 | 0.6470 |
94
+ | 0.7021 | 46.0 | 276 | 0.6468 |
95
+ | 0.7142 | 47.0 | 282 | 0.6463 |
96
+ | 0.7211 | 48.0 | 288 | 0.6456 |
97
+ | 0.7098 | 49.0 | 294 | 0.6453 |
98
+ | 0.7150 | 50.0 | 300 | 0.6452 |
99
+ | 0.7147 | 51.0 | 306 | 0.6452 |
100
+ | 0.7114 | 52.0 | 312 | 0.6451 |
101
+ | 0.7076 | 53.0 | 318 | 0.6450 |
102
+ | 0.7286 | 54.0 | 324 | 0.6447 |
103
+ | 0.7008 | 55.0 | 330 | 0.6445 |
104
+ | 0.7004 | 56.0 | 336 | 0.6442 |
105
+ | 0.7087 | 57.0 | 342 | 0.6441 |
106
+ | 0.6944 | 58.0 | 348 | 0.6439 |
107
+ | 0.7045 | 59.0 | 354 | 0.6437 |
108
+ | 0.7096 | 60.0 | 360 | 0.6437 |
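The step counts in the table can be checked against the hyperparameters with a little arithmetic. The sketch below is illustrative only. The bounds on the training-set size assume a single device, no gradient accumulation, and that the final partial batch is kept; none of these are stated in the card. The previous evaluation loss (0.6665) is taken from the line this commit removes.

```python
# Values read from the model card in this diff.
TRAIN_BATCH_SIZE = 512
STEPS_PER_EPOCH = 6      # step 6 at epoch 1.0
NUM_EPOCHS = 60

# Total optimizer steps; matches the last row of the table (step 360).
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

# Assumption: one device, no gradient accumulation, last partial batch kept.
# Then 6 steps per epoch means 5 full batches plus at least 1 more example,
# and at most 6 full batches.
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

# Relative change in evaluation loss between the old and new card.
prev_eval_loss = 0.6665
new_eval_loss = 0.6437
relative_drop = (prev_eval_loss - new_eval_loss) / prev_eval_loss

if __name__ == "__main__":
    print(f"total steps: {total_steps}")
    print(f"training examples: between {min_examples} and {max_examples}")
    print(f"relative eval-loss reduction: {relative_drop:.1%}")
```

Read this way, the additional 60 epochs at the lower learning rate reduce the evaluation loss by a few percent relative to the earlier run.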
109
 
110
 
111
  ### Framework versions