maud-dr committed on
Commit 3142396 · verified · 1 Parent(s): a7b647e

End of training

Files changed (2)
  1. README.md +22 -22
  2. model.safetensors +1 -1
README.md CHANGED
@@ -9,21 +9,21 @@ metrics:
 - recall
 - f1
 model-index:
-- name: model_2_stage2-seed_123
+- name: model_2_stage2-seed_2025
 results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# model_2_stage2-seed_123
+# model_2_stage2-seed_2025
 
 This model is a fine-tuned version of [maud-dr/model_2_stage1](https://huggingface.co/maud-dr/model_2_stage1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.7954
-- Precision: 0.6352
-- Recall: 0.7319
-- F1: 0.6801
+- Loss: 2.6600
+- Precision: 0.6346
+- Recall: 0.7174
+- F1: 0.6735
 
 ## Model description
 
@@ -45,7 +45,7 @@ The following hyperparameters were used during training:
 - learning_rate: 0.0003
 - train_batch_size: 8
 - eval_batch_size: 8
-- seed: 123
+- seed: 2025
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 15
@@ -54,21 +54,21 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|
-| 0.5075 | 1.0 | 447 | 0.6591 | 0.6296 | 0.6159 | 0.6227 |
-| 0.4241 | 2.0 | 894 | 0.8133 | 0.6012 | 0.7536 | 0.6688 |
-| 0.3651 | 3.0 | 1341 | 0.8972 | 0.6092 | 0.7174 | 0.6589 |
-| 0.3389 | 4.0 | 1788 | 1.3235 | 0.6040 | 0.7572 | 0.6720 |
-| 0.2572 | 5.0 | 2235 | 1.2850 | 0.6378 | 0.7464 | 0.6878 |
-| 0.187 | 6.0 | 2682 | 1.4055 | 0.6114 | 0.7754 | 0.6837 |
-| 0.1456 | 7.0 | 3129 | 1.8037 | 0.6464 | 0.6558 | 0.6511 |
-| 0.1386 | 8.0 | 3576 | 1.8962 | 0.6181 | 0.6920 | 0.6530 |
-| 0.1003 | 9.0 | 4023 | 2.1076 | 0.6198 | 0.7029 | 0.6587 |
-| 0.0738 | 10.0 | 4470 | 2.4260 | 0.6463 | 0.7283 | 0.6848 |
-| 0.0233 | 11.0 | 4917 | 2.5047 | 0.6242 | 0.7464 | 0.6799 |
-| 0.0677 | 12.0 | 5364 | 2.6329 | 0.6238 | 0.7029 | 0.6610 |
-| 0.0249 | 13.0 | 5811 | 2.5839 | 0.6429 | 0.7174 | 0.6781 |
-| 0.0249 | 14.0 | 6258 | 2.7944 | 0.6347 | 0.7428 | 0.6845 |
-| 0.0228 | 15.0 | 6705 | 2.7954 | 0.6352 | 0.7319 | 0.6801 |
+| 0.2914 | 1.0 | 447 | 1.5544 | 0.5815 | 0.7754 | 0.6646 |
+| 0.2523 | 2.0 | 894 | 1.6443 | 0.6469 | 0.6703 | 0.6584 |
+| 0.16 | 3.0 | 1341 | 1.8783 | 0.6144 | 0.6812 | 0.6460 |
+| 0.1354 | 4.0 | 1788 | 1.5711 | 0.6287 | 0.7790 | 0.6958 |
+| 0.1321 | 5.0 | 2235 | 1.7032 | 0.6607 | 0.6703 | 0.6655 |
+| 0.1108 | 6.0 | 2682 | 1.9982 | 0.6144 | 0.6812 | 0.6460 |
+| 0.103 | 7.0 | 3129 | 2.2463 | 0.6146 | 0.6993 | 0.6542 |
+| 0.0778 | 8.0 | 3576 | 2.3003 | 0.6304 | 0.6920 | 0.6598 |
+| 0.0428 | 9.0 | 4023 | 2.6554 | 0.6226 | 0.6993 | 0.6587 |
+| 0.0589 | 10.0 | 4470 | 2.4618 | 0.6237 | 0.6667 | 0.6445 |
+| 0.046 | 11.0 | 4917 | 2.5882 | 0.6242 | 0.7101 | 0.6644 |
+| 0.0311 | 12.0 | 5364 | 2.5561 | 0.6321 | 0.7283 | 0.6768 |
+| 0.0288 | 13.0 | 5811 | 2.6707 | 0.6410 | 0.7246 | 0.6803 |
+| 0.0296 | 14.0 | 6258 | 2.6000 | 0.6343 | 0.7101 | 0.6701 |
+| 0.0002 | 15.0 | 6705 | 2.6600 | 0.6346 | 0.7174 | 0.6735 |
 
 
 ### Framework versions
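
The headline metrics in the README diff can be sanity-checked: if the card's F1 is the usual harmonic mean of precision and recall (an assumption, since the card does not say how the metrics are computed), the reported F1 should be reproducible from the reported precision and recall up to rounding. A minimal sketch using the epoch-15 values from both versions of the card; the helper name `f1_from` is illustrative and not part of the training code.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) at epoch 15, copied from the model card diff.
reported = {
    "seed_123": (0.6352, 0.7319, 0.6801),
    "seed_2025": (0.6346, 0.7174, 0.6735),
}

for name, (p, r, f1) in reported.items():
    recomputed = f1_from(p, r)
    # Inputs are rounded to 4 decimals, so allow a small tolerance.
    status = "ok" if abs(recomputed - f1) < 1e-3 else "mismatch"
    print(f"{name}: reported={f1:.4f} recomputed={recomputed:.4f} ({status})")
```

Both seeds should come out consistent under this assumption; a mismatch would suggest the metrics are averaged differently (for example per class) rather than computed from the aggregate precision and recall.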
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4781f1f99cd9fc34133949dd4639cdec472382438d6e0001e08d4f63ec3262c6
+oid sha256:3d05e3f87dbaba9506739173909dbf483ea55ebd18081b1d19d9c1e43702379e
 size 894020048
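
The `model.safetensors` change touches only the Git LFS pointer: the `oid` is the SHA-256 of the weights file, and the `size` is unchanged. A downloaded copy can be checked against the new pointer roughly as follows. This is a sketch, not part of the repository; the function names and the local path in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    if pointer["algo"] != "sha256" or path.stat().st_size != pointer["size"]:
        return False
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in chunks so a large weights file is never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# New pointer contents, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3d05e3f87dbaba9506739173909dbf483ea55ebd18081b1d19d9c1e43702379e
size 894020048
"""
pointer = parse_lfs_pointer(NEW_POINTER)
```

With a local download, `matches_pointer(Path("model.safetensors"), pointer)` should return `True` for the weights from this commit and `False` for those from the parent commit.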