maud-dr committed on
Commit 67b7480 · verified · 1 Parent(s): 3977ba0

End of training

Files changed (2)
  1. README.md +22 -22
  2. model.safetensors +1 -1
README.md CHANGED
@@ -9,21 +9,21 @@ metrics:
  - recall
  - f1
  model-index:
- - name: baseline_3-seed_42
+ - name: baseline_3-seed_123
  results: []
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- # baseline_3-seed_42
+ # baseline_3-seed_123
 
  This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 3.7534
- - Precision: 0.3541
- - Recall: 0.3913
- - F1: 0.3718
+ - Loss: 3.5330
+ - Precision: 0.3762
+ - Recall: 0.4239
+ - F1: 0.3986
 
  ## Model description
 
@@ -45,7 +45,7 @@ The following hyperparameters were used during training:
  - learning_rate: 0.0003
  - train_batch_size: 8
  - eval_batch_size: 8
- - seed: 42
+ - seed: 123
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
  - num_epochs: 15
@@ -54,21 +54,21 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 |
  |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|
- | 0.7008 | 1.0 | 447 | 0.7467 | 0.4473 | 1.0 | 0.6181 |
- | 0.6933 | 2.0 | 894 | 0.6963 | 0.45 | 0.9783 | 0.6164 |
- | 0.6772 | 3.0 | 1341 | 0.6932 | 0.4820 | 0.8261 | 0.6088 |
- | 0.6708 | 4.0 | 1788 | 0.7332 | 0.4637 | 0.7174 | 0.5633 |
- | 0.6384 | 5.0 | 2235 | 0.8510 | 0.4468 | 0.6087 | 0.5153 |
- | 0.6096 | 6.0 | 2682 | 0.9629 | 0.4371 | 0.4783 | 0.4567 |
- | 0.5706 | 7.0 | 3129 | 1.4994 | 0.4115 | 0.5978 | 0.4874 |
- | 0.5401 | 8.0 | 3576 | 1.6980 | 0.3933 | 0.5072 | 0.4430 |
- | 0.4697 | 9.0 | 4023 | 2.2275 | 0.3797 | 0.4348 | 0.4054 |
- | 0.4824 | 10.0 | 4470 | 2.5809 | 0.3933 | 0.4674 | 0.4272 |
- | 0.4379 | 11.0 | 4917 | 2.6967 | 0.3742 | 0.4420 | 0.4053 |
- | 0.4328 | 12.0 | 5364 | 2.9542 | 0.3683 | 0.4457 | 0.4033 |
- | 0.4116 | 13.0 | 5811 | 3.2770 | 0.3705 | 0.4094 | 0.3890 |
- | 0.3725 | 14.0 | 6258 | 3.3860 | 0.3543 | 0.3877 | 0.3702 |
- | 0.3673 | 15.0 | 6705 | 3.7534 | 0.3541 | 0.3913 | 0.3718 |
+ | 0.7003 | 1.0 | 447 | 0.6781 | 0.5847 | 0.25 | 0.3503 |
+ | 0.6856 | 2.0 | 894 | 0.6851 | 0.4984 | 0.5580 | 0.5265 |
+ | 0.6807 | 3.0 | 1341 | 0.7132 | 0.4770 | 0.4891 | 0.4830 |
+ | 0.6466 | 4.0 | 1788 | 0.8288 | 0.4368 | 0.7391 | 0.5491 |
+ | 0.6199 | 5.0 | 2235 | 0.9322 | 0.3919 | 0.3877 | 0.3898 |
+ | 0.5555 | 6.0 | 2682 | 1.0183 | 0.4146 | 0.5362 | 0.4676 |
+ | 0.5314 | 7.0 | 3129 | 1.2989 | 0.4089 | 0.5688 | 0.4758 |
+ | 0.4984 | 8.0 | 3576 | 1.6869 | 0.3630 | 0.3986 | 0.3800 |
+ | 0.4826 | 9.0 | 4023 | 1.9650 | 0.3799 | 0.4529 | 0.4132 |
+ | 0.4688 | 10.0 | 4470 | 2.3726 | 0.3776 | 0.4022 | 0.3895 |
+ | 0.4172 | 11.0 | 4917 | 2.4798 | 0.3978 | 0.5145 | 0.4487 |
+ | 0.423 | 12.0 | 5364 | 2.8128 | 0.3827 | 0.4493 | 0.4133 |
+ | 0.4037 | 13.0 | 5811 | 3.0582 | 0.3863 | 0.4493 | 0.4154 |
+ | 0.3576 | 14.0 | 6258 | 3.2799 | 0.3830 | 0.4565 | 0.4165 |
+ | 0.3066 | 15.0 | 6705 | 3.5330 | 0.3762 | 0.4239 | 0.3986 |
 
 
  ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ac3cf8685d264d68572c3f57f4da514e0d9cae183ebfbefcf6eada39999ca193
+ oid sha256:75bcc488910d473b3d807b016d0f8d3511d8cfd39266f97af61bd2e6b3df88ff
  size 894020048
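
Review note: the Precision, Recall and F1 columns in the README.md results table can be sanity-checked against each other. The sketch below assumes F1 is the plain harmonic mean of the reported precision and recall; the commit does not say how the metrics were computed or averaged, so that is an assumption. The rows are copied from the seed_123 table, and the names `f1_from`, `inconsistent_rows` and `TOLERANCE` are illustrative, not part of the training code.

```python
# Rows copied from the seed_123 results table:
# (epoch, precision, recall, reported F1).
ROWS = [
    (1, 0.5847, 0.25, 0.3503),
    (2, 0.4984, 0.5580, 0.5265),
    (4, 0.4368, 0.7391, 0.5491),
    (15, 0.3762, 0.4239, 0.3986),
]

# The table values are rounded to four decimals, so the comparison allows
# a small tolerance. The exact value is an assumption made for this check.
TOLERANCE = 5e-4


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def inconsistent_rows(rows=ROWS, tol=TOLERANCE):
    """Return the epochs whose reported F1 disagrees with precision/recall."""
    return [
        epoch
        for epoch, precision, recall, reported in rows
        if abs(f1_from(precision, recall) - reported) > tol
    ]


if __name__ == "__main__":
    # An empty list means every sampled row is internally consistent.
    print(inconsistent_rows())
```

For the sampled rows the recomputed F1 matches the reported value within the tolerance, which is consistent with the harmonic-mean assumption but does not confirm how the evaluation code derives its metrics.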