End of training
README.md CHANGED

@@ -20,14 +20,14 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [ufal/robeczech-base](https://huggingface.co/ufal/robeczech-base) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Accuracy: 0.
-- Micro Precision: 0.
-- Micro Recall: 0.
-- Micro F1: 0.
-- Macro Precision: 0.
-- Macro Recall: 0.
-- Macro F1: 0.
+- Loss: 0.9004
+- Accuracy: 0.8819
+- Micro Precision: 0.8819
+- Micro Recall: 0.8819
+- Micro F1: 0.8819
+- Macro Precision: 0.8488
+- Macro Recall: 0.8326
+- Macro F1: 0.8379
 
 ## Model description
 
@@ -46,14 +46,14 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate:
+- learning_rate: 0.0001
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 32
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type:
+- lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1000
 - num_epochs: 10
@@ -61,16 +61,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | Micro Precision | Micro Recall | Micro F1 | Macro Precision | Macro Recall | Macro F1 |
 |:-------------:|:------:|:------:|:---------------:|:--------:|:---------------:|:------------:|:--------:|:---------------:|:------------:|:--------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 0.5518 | 1.0000 | 11305 | 0.5227 | 0.8496 | 0.8496 | 0.8496 | 0.8496 | 0.8293 | 0.7648 | 0.7779 |
+| 0.4797 | 2.0 | 22611 | 0.4742 | 0.8623 | 0.8623 | 0.8623 | 0.8623 | 0.8191 | 0.8141 | 0.8052 |
+| 0.369 | 3.0000 | 33916 | 0.4886 | 0.8684 | 0.8684 | 0.8684 | 0.8684 | 0.8493 | 0.8094 | 0.8198 |
+| 0.308 | 4.0 | 45222 | 0.4829 | 0.8685 | 0.8685 | 0.8685 | 0.8685 | 0.8347 | 0.8231 | 0.8228 |
+| 0.2395 | 5.0000 | 56527 | 0.4928 | 0.8755 | 0.8755 | 0.8755 | 0.8755 | 0.8300 | 0.8326 | 0.8265 |
+| 0.1852 | 6.0 | 67833 | 0.5186 | 0.8799 | 0.8799 | 0.8799 | 0.8799 | 0.8528 | 0.8385 | 0.8401 |
+| 0.1353 | 7.0000 | 79138 | 0.5951 | 0.8809 | 0.8809 | 0.8809 | 0.8809 | 0.8419 | 0.8419 | 0.8377 |
+| 0.0945 | 8.0 | 90444 | 0.6848 | 0.8847 | 0.8847 | 0.8847 | 0.8847 | 0.8510 | 0.8478 | 0.8438 |
+| 0.0551 | 9.0000 | 101749 | 0.7723 | 0.8867 | 0.8867 | 0.8867 | 0.8867 | 0.8469 | 0.8440 | 0.8405 |
+| 0.0319 | 9.9996 | 113050 | 0.8430 | 0.8882 | 0.8882 | 0.8882 | 0.8882 | 0.8492 | 0.8487 | 0.8448 |
 
 
 ### Framework versions
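The card reports both micro- and macro-averaged metrics, and the four micro columns are identical to accuracy in every row. That is expected for single-label multiclass classification: each prediction contributes exactly one pooled false positive and one false negative when wrong, so micro precision, recall, and F1 all collapse to accuracy, while macro averaging weights each class equally. A minimal sketch of the two averaging schemes (the labels below are illustrative, not taken from this model's dataset):

```python
from collections import Counter

def micro_macro_f1(y_true, y_pred):
    """Compute micro- and macro-averaged F1 for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        prec = tp_ / (tp_ + fp_) if tp_ + fp_ else 0.0
        rec = tp_ / (tp_ + fn_) if tp_ + fn_ else 0.0
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

    # Micro: pool counts across classes -> equals accuracy for single-label tasks.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean of per-class F1 scores.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro

# Hypothetical tags, just to exercise the two averages.
y_true = ["NOUN", "VERB", "NOUN", "ADJ", "VERB", "NOUN"]
y_pred = ["NOUN", "VERB", "VERB", "ADJ", "VERB", "NOUN"]
micro, macro = micro_macro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here `micro` equals `accuracy` exactly, which mirrors the identical Accuracy/Micro columns in the table above, while `macro` differs because the rarer class is weighted the same as the frequent ones.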
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8247b40a616cc12f40ffab8dc6e14923b33d50ba56b479a9e183903a576c0ddc
 size 504532408
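The weights themselves live in Git LFS; the commit only rewrites the pointer file, a three-line text stub recording the SHA-256 and byte size of the real content. A small sketch of building (or verifying) such a pointer for a local file, assuming the spec v1 format shown above (`demo.bin` is a throwaway example, not the actual checkpoint):

```python
import hashlib
import os

def lfs_pointer(path):
    """Build a Git LFS pointer (spec v1) for the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large checkpoints don't load into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{h.hexdigest()}\n"
        f"size {os.path.getsize(path)}\n"
    )

# Demo with a tiny throwaway file.
with open("demo.bin", "wb") as f:
    f.write(b"hello")
print(lfs_pointer("demo.bin"))
```

Comparing the generated `oid` against the pointer stored in the repository is enough to confirm a downloaded weight file is intact.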
|
runs/Mar31_16-42-38_dgx10/events.out.tfevents.1743432162.dgx10.924525.2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a85ba3ae830ba8c676057e0c7c93c3fb96b83a7debdb3b3a8c5e5f1b5b985ebf
+size 65367
runs/Mar31_16-42-38_dgx10/events.out.tfevents.1743446781.dgx10.924525.3 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b4da223ede9b2d4547784384aa3ca787f5f31523fa11f7c2481357dc27762ef7
+size 757
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:d586a8ca17d4e6ed9c5439e93cccba6929c21065519d7962a729e2c00e5b7242
+size 5304
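`training_args.bin` is the serialized training configuration behind the hyperparameters in the README above: a linear scheduler with 1000 warmup steps, a peak learning rate of 0.0001, and roughly 113050 optimizer steps in total (per the last row of the eval table). A minimal sketch of what that schedule looks like, assuming the usual linear warmup followed by linear decay to zero (the helper name and step counts are illustrative, not read from the file):

```python
def linear_schedule_lr(step, peak_lr=1e-4, warmup_steps=1000, total_steps=113050):
    """Linear warmup to peak_lr over warmup_steps, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # ramp up from 0
    remaining = max(0, total_steps - step)    # fraction of decay phase left
    return peak_lr * remaining / (total_steps - warmup_steps)

# Effective batch size reported in the card: per-device batch * accumulation.
train_batch_size = 16
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 32
```

With warmup this short relative to the run, the schedule is near its peak for most of epoch 1 and then decays almost linearly across the remaining nine epochs.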