maud-dr/baseline_3-seed_123

@@ -9,21 +9,21 @@ metrics:
 - recall
 - f1
 model-index:
-- name: baseline_3-seed_42
+- name: baseline_3-seed_123
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# baseline_3-seed_42
+# baseline_3-seed_123
 This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.7534
-- Precision: 0.3541
-- Recall: 0.3913
-- F1: 0.3718
+- Loss: 3.5330
+- Precision: 0.3762
+- Recall: 0.4239
+- F1: 0.3986
 ## Model description
@@ -45,7 +45,7 @@ The following hyperparameters were used during training:
 - learning_rate: 0.0003
 - train_batch_size: 8
 - eval_batch_size: 8
-- seed: 42
+- seed: 123
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 15
@@ -54,21 +54,21 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|
-| 0.7008        | 1.0   | 447  | 0.7467          | 0.4473    | 1.0    | 0.6181 |
-| 0.6933        | 2.0   | 894  | 0.6963          | 0.45      | 0.9783 | 0.6164 |
-| 0.6772        | 3.0   | 1341 | 0.6932          | 0.4820    | 0.8261 | 0.6088 |
-| 0.6708        | 4.0   | 1788 | 0.7332          | 0.4637    | 0.7174 | 0.5633 |
-| 0.6384        | 5.0   | 2235 | 0.8510          | 0.4468    | 0.6087 | 0.5153 |
-| 0.6096        | 6.0   | 2682 | 0.9629          | 0.4371    | 0.4783 | 0.4567 |
-| 0.5706        | 7.0   | 3129 | 1.4994          | 0.4115    | 0.5978 | 0.4874 |
-| 0.5401        | 8.0   | 3576 | 1.6980          | 0.3933    | 0.5072 | 0.4430 |
-| 0.4697        | 9.0   | 4023 | 2.2275          | 0.3797    | 0.4348 | 0.4054 |
-| 0.4824        | 10.0  | 4470 | 2.5809          | 0.3933    | 0.4674 | 0.4272 |
-| 0.4379        | 11.0  | 4917 | 2.6967          | 0.3742    | 0.4420 | 0.4053 |
-| 0.4328        | 12.0  | 5364 | 2.9542          | 0.3683    | 0.4457 | 0.4033 |
-| 0.4116        | 13.0  | 5811 | 3.2770          | 0.3705    | 0.4094 | 0.3890 |
-| 0.3725        | 14.0  | 6258 | 3.3860          | 0.3543    | 0.3877 | 0.3702 |
-| 0.3673        | 15.0  | 6705 | 3.7534          | 0.3541    | 0.3913 | 0.3718 |
+| 0.7003        | 1.0   | 447  | 0.6781          | 0.5847    | 0.25   | 0.3503 |
+| 0.6856        | 2.0   | 894  | 0.6851          | 0.4984    | 0.5580 | 0.5265 |
+| 0.6807        | 3.0   | 1341 | 0.7132          | 0.4770    | 0.4891 | 0.4830 |
+| 0.6466        | 4.0   | 1788 | 0.8288          | 0.4368    | 0.7391 | 0.5491 |
+| 0.6199        | 5.0   | 2235 | 0.9322          | 0.3919    | 0.3877 | 0.3898 |
+| 0.5555        | 6.0   | 2682 | 1.0183          | 0.4146    | 0.5362 | 0.4676 |
+| 0.5314        | 7.0   | 3129 | 1.2989          | 0.4089    | 0.5688 | 0.4758 |
+| 0.4984        | 8.0   | 3576 | 1.6869          | 0.3630    | 0.3986 | 0.3800 |
+| 0.4826        | 9.0   | 4023 | 1.9650          | 0.3799    | 0.4529 | 0.4132 |
+| 0.4688        | 10.0  | 4470 | 2.3726          | 0.3776    | 0.4022 | 0.3895 |
+| 0.4172        | 11.0  | 4917 | 2.4798          | 0.3978    | 0.5145 | 0.4487 |
+| 0.423         | 12.0  | 5364 | 2.8128          | 0.3827    | 0.4493 | 0.4133 |
+| 0.4037        | 13.0  | 5811 | 3.0582          | 0.3863    | 0.4493 | 0.4154 |
+| 0.3576        | 14.0  | 6258 | 3.2799          | 0.3830    | 0.4565 | 0.4165 |
+| 0.3066        | 15.0  | 6705 | 3.5330          | 0.3762    | 0.4239 | 0.3986 |
 ### Framework versions
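
Reviewer note on the seed 123 results: training loss falls from 0.7003 to 0.3066 while validation loss rises from 0.6781 to 3.5330, so the reported final-epoch metrics are not the best ones in the table. The sketch below copies the seed 123 rows by hand, rechecks that each F1 is consistent with its precision and recall via F1 = 2PR / (P + R), and picks the best epoch under two criteria. The names `EpochResult`, `ROWS`, and `f1_from` are illustrative and not part of the repository. The card does not say how the published checkpoint was selected, so this is only a reading aid.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    train_loss: float
    val_loss: float
    precision: float
    recall: float
    f1: float


# Values copied from the seed 123 training table in the diff above.
ROWS = [
    EpochResult(1, 0.7003, 0.6781, 0.5847, 0.25, 0.3503),
    EpochResult(2, 0.6856, 0.6851, 0.4984, 0.5580, 0.5265),
    EpochResult(3, 0.6807, 0.7132, 0.4770, 0.4891, 0.4830),
    EpochResult(4, 0.6466, 0.8288, 0.4368, 0.7391, 0.5491),
    EpochResult(5, 0.6199, 0.9322, 0.3919, 0.3877, 0.3898),
    EpochResult(6, 0.5555, 1.0183, 0.4146, 0.5362, 0.4676),
    EpochResult(7, 0.5314, 1.2989, 0.4089, 0.5688, 0.4758),
    EpochResult(8, 0.4984, 1.6869, 0.3630, 0.3986, 0.3800),
    EpochResult(9, 0.4826, 1.9650, 0.3799, 0.4529, 0.4132),
    EpochResult(10, 0.4688, 2.3726, 0.3776, 0.4022, 0.3895),
    EpochResult(11, 0.4172, 2.4798, 0.3978, 0.5145, 0.4487),
    EpochResult(12, 0.423, 2.8128, 0.3827, 0.4493, 0.4133),
    EpochResult(13, 0.4037, 3.0582, 0.3863, 0.4493, 0.4154),
    EpochResult(14, 0.3576, 3.2799, 0.3830, 0.4565, 0.4165),
    EpochResult(15, 0.3066, 3.5330, 0.3762, 0.4239, 0.3986),
]


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Sanity check: every reported F1 should match its precision/recall pair
# up to the four-decimal rounding used in the table.
for row in ROWS:
    assert abs(f1_from(row.precision, row.recall) - row.f1) < 1e-3

best_by_f1 = max(ROWS, key=lambda row: row.f1)
best_by_loss = min(ROWS, key=lambda row: row.val_loss)

print(f"best F1: epoch {best_by_f1.epoch} (F1={best_by_f1.f1})")
print(f"lowest validation loss: epoch {best_by_loss.epoch} (loss={best_by_loss.val_loss})")
```

For these rows the highest F1 is at epoch 4 (0.5491) and the lowest validation loss is at epoch 1 (0.6781), both well before the final epoch that the card reports.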

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ac3cf8685d264d68572c3f57f4da514e0d9cae183ebfbefcf6eada39999ca193
+oid sha256:75bcc488910d473b3d807b016d0f8d3511d8cfd39266f97af61bd2e6b3df88ff
 size 894020048
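
The pointer above records only a SHA-256 digest and a byte size, and only the digest changes in this commit. A minimal sketch of how a downloaded `model.safetensors` could be checked against such a pointer is shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the code assumes the simple three-line `key value` pointer layout shown in the diff.

```python
import hashlib
from pathlib import Path

# New-side pointer text, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:75bcc488910d473b3d807b016d0f8d3511d8cfd39266f97af61bd2e6b3df88ff
size 894020048
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest in `pointer`."""
    file_path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a large file.
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so the whole file never sits in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

With a local copy of the weights, `matches_pointer("model.safetensors", pointer)` would report whether that copy corresponds to the seed 123 commit rather than the earlier seed 42 one, since both files have the same size.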