End of training
README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [mikhail-panzo/malay_full_checkpoint](https://huggingface.co/mikhail-panzo/malay_full_checkpoint) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.4735
 
 ## Model description
 
@@ -34,7 +34,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 
+- learning_rate: 1e-06
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
@@ -43,23 +43,19 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2000
-- training_steps: 
+- training_steps: 3000
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.4037 | 47.3  | 3500 | 0.4051 |
-| 0.3933 | 54.05 | 4000 | 0.3945 |
-| 0.3875 | 60.81 | 4500 | 0.3928 |
-| 0.3828 | 67.57 | 5000 | 0.3935 |
+| 0.7579 | 6.76  | 500  | 0.7164 |
+| 0.6196 | 13.51 | 1000 | 0.5662 |
+| 0.5622 | 20.27 | 1500 | 0.5077 |
+| 0.5341 | 27.03 | 2000 | 0.4858 |
+| 0.52   | 33.78 | 2500 | 0.4772 |
+| 0.5233 | 40.54 | 3000 | 0.4735 |
 
 
 ### Framework versions
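The hyperparameters in the updated card map one-to-one onto the Hugging Face `transformers` `TrainingArguments` API. Below is a minimal sketch of that mapping, assuming the run used the standard `Trainer` with a 500-step evaluation and save cadence (consistent with the Step column of the results table); the card does not include the actual training script, and `output_dir` is a hypothetical placeholder.

```python
# Minimal sketch: how the card's listed hyperparameters map onto
# Hugging Face `TrainingArguments`. Not the author's actual script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="malay-finetune",     # hypothetical; not stated in the card
    learning_rate=1e-6,              # learning_rate: 1e-06
    per_device_train_batch_size=16,  # train_batch_size: 16
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=42,                         # seed: 42
    adam_beta1=0.9,                  # optimizer: Adam with betas=(0.9,0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # ... and epsilon=1e-08 (the Trainer default)
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    warmup_steps=2000,               # lr_scheduler_warmup_steps: 2000
    max_steps=3000,                  # training_steps: 3000 (the new value in this commit)
    fp16=True,                       # mixed_precision_training: Native AMP
    eval_strategy="steps",           # assumed: eval every 500 steps, matching the table
    eval_steps=500,                  # (`evaluation_strategy` in older transformers releases)
    save_steps=500,
    logging_steps=500,
)
```

Two details follow from the table itself: 40.54 epochs at step 3000 implies roughly 74 optimizer steps per epoch for this dataset, and with 2000 warmup steps out of 3000 total, the learning rate is still ramping up for about two-thirds of the run.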