End of training
README.md
CHANGED
@@ -44,33 +44,43 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 100
-- training_steps:
+- training_steps: 6000
 - mixed_precision_training: Native AMP

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.
-| 0.
-| 0.523 |
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 0.6508 | 2.5 | 200 | 0.5943 |
+| 0.5656 | 5.0 | 400 | 0.5341 |
+| 0.523 | 7.5 | 600 | 0.5141 |
+| 0.512 | 10.0 | 800 | 0.4860 |
+| 0.4966 | 12.5 | 1000 | 0.4803 |
+| 0.522 | 15.0 | 1200 | 0.4921 |
+| 0.4775 | 17.5 | 1400 | 0.4678 |
+| 0.4726 | 20.0 | 1600 | 0.5031 |
+| 0.4623 | 22.5 | 1800 | 0.4611 |
+| 0.4612 | 25.0 | 2000 | 0.4593 |
+| 0.4526 | 27.5 | 2200 | 0.4753 |
+| 0.4558 | 30.0 | 2400 | 0.4578 |
+| 0.4468 | 32.5 | 2600 | 0.4620 |
+| 0.4474 | 35.0 | 2800 | 0.4618 |
+| 0.4394 | 37.5 | 3000 | 0.4589 |
+| 0.4332 | 40.0 | 3200 | 0.4463 |
+| 0.4382 | 42.5 | 3400 | 0.4456 |
+| 0.4382 | 45.0 | 3600 | 0.4481 |
+| 0.4283 | 47.5 | 3800 | 0.4435 |
+| 0.4278 | 50.0 | 4000 | 0.4470 |
+| 0.4281 | 52.5 | 4200 | 0.4484 |
+| 0.4236 | 55.0 | 4400 | 0.4482 |
+| 0.422 | 57.5 | 4600 | 0.4480 |
+| 0.4271 | 60.0 | 4800 | 0.4477 |
+| 0.4105 | 62.5 | 5000 | 0.4475 |
+| 0.4121 | 65.0 | 5200 | 0.4502 |
+| 0.4115 | 67.5 | 5400 | 0.4522 |
+| 0.4081 | 70.0 | 5600 | 0.4561 |
+| 0.4059 | 72.5 | 5800 | 0.4610 |
+| 0.4048 | 75.0 | 6000 | 0.4576 |


 ### Framework versions
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:d377f5ec6cbab3084fed0f89b9d620a4e6c922c71eccc9fba00f523f4601baff
 size 577789320
runs/Apr17_07-22-39_827dd6811cc0/events.out.tfevents.1744874569.827dd6811cc0.579.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:bf4b50ff9f962357b78685d14458ca5043cdc151ace75a0c77c142391cdc4f52
+size 65651