stulcrad committed
Commit a780843 · verified · 1 Parent(s): 0aa918b

End of training

README.md CHANGED
@@ -5,16 +5,12 @@ base_model: ufal/robeczech-base
 tags:
 - generated_from_trainer
 datasets:
-- stulcrad/CERED-2
+- generator
 metrics:
 - accuracy
-- f1
-- recall
 model-index:
 - name: Robeczech-CERED2
   results: []
-language:
-- cs
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -24,14 +20,14 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [ufal/robeczech-base](https://huggingface.co/ufal/robeczech-base) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9004
-- Accuracy: 0.8819
-- Micro Precision: 0.8819
-- Micro Recall: 0.8819
-- Micro F1: 0.8819
-- Macro Precision: 0.8488
-- Macro Recall: 0.8326
-- Macro F1: 0.8379
+- Loss: 1.1300
+- Accuracy: 0.8985
+- Micro Precision: 0.8985
+- Micro Recall: 0.8985
+- Micro F1: 0.8985
+- Macro Precision: 0.8711
+- Macro Recall: 0.8608
+- Macro F1: 0.8632
 
 ## Model description
 
@@ -50,31 +46,32 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.0001
+- learning_rate: 1e-05
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 32
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 1000
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 1500
 - num_epochs: 10
+- label_smoothing_factor: 0.1
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | Micro Precision | Micro Recall | Micro F1 | Macro Precision | Macro Recall | Macro F1 |
 |:-------------:|:------:|:------:|:---------------:|:--------:|:---------------:|:------------:|:--------:|:---------------:|:------------:|:--------:|
-| 0.5518 | 1.0000 | 11305 | 0.5227 | 0.8496 | 0.8496 | 0.8496 | 0.8496 | 0.8293 | 0.7648 | 0.7779 |
-| 0.4797 | 2.0 | 22611 | 0.4742 | 0.8623 | 0.8623 | 0.8623 | 0.8623 | 0.8191 | 0.8141 | 0.8052 |
-| 0.369 | 3.0000 | 33916 | 0.4886 | 0.8684 | 0.8684 | 0.8684 | 0.8684 | 0.8493 | 0.8094 | 0.8198 |
-| 0.308 | 4.0 | 45222 | 0.4829 | 0.8685 | 0.8685 | 0.8685 | 0.8685 | 0.8347 | 0.8231 | 0.8228 |
-| 0.2395 | 5.0000 | 56527 | 0.4928 | 0.8755 | 0.8755 | 0.8755 | 0.8755 | 0.8300 | 0.8326 | 0.8265 |
-| 0.1852 | 6.0 | 67833 | 0.5186 | 0.8799 | 0.8799 | 0.8799 | 0.8799 | 0.8528 | 0.8385 | 0.8401 |
-| 0.1353 | 7.0000 | 79138 | 0.5951 | 0.8809 | 0.8809 | 0.8809 | 0.8809 | 0.8419 | 0.8419 | 0.8377 |
-| 0.0945 | 8.0 | 90444 | 0.6848 | 0.8847 | 0.8847 | 0.8847 | 0.8847 | 0.8510 | 0.8478 | 0.8438 |
-| 0.0551 | 9.0000 | 101749 | 0.7723 | 0.8867 | 0.8867 | 0.8867 | 0.8867 | 0.8469 | 0.8440 | 0.8405 |
-| 0.0319 | 9.9996 | 113050 | 0.8430 | 0.8882 | 0.8882 | 0.8882 | 0.8882 | 0.8492 | 0.8487 | 0.8448 |
+| 1.1585 | 1.0000 | 11305 | 1.1208 | 0.8608 | 0.8608 | 0.8608 | 0.8608 | 0.8155 | 0.7878 | 0.7914 |
+| 1.0617 | 2.0 | 22611 | 1.0567 | 0.8873 | 0.8873 | 0.8873 | 0.8873 | 0.8547 | 0.8428 | 0.8430 |
+| 0.9804 | 3.0000 | 33916 | 1.0558 | 0.8900 | 0.8900 | 0.8900 | 0.8900 | 0.8546 | 0.8414 | 0.8438 |
+| 0.9327 | 4.0 | 45222 | 1.0585 | 0.8920 | 0.8920 | 0.8920 | 0.8920 | 0.8557 | 0.8475 | 0.8483 |
+| 0.8927 | 5.0000 | 56527 | 1.0820 | 0.8917 | 0.8917 | 0.8917 | 0.8917 | 0.8484 | 0.8499 | 0.8455 |
+| 0.861 | 6.0 | 67833 | 1.0774 | 0.8982 | 0.8982 | 0.8982 | 0.8982 | 0.8596 | 0.8567 | 0.8545 |
+| 0.8344 | 7.0000 | 79138 | 1.0987 | 0.8979 | 0.8979 | 0.8979 | 0.8979 | 0.8641 | 0.8558 | 0.8567 |
+| 0.8222 | 8.0 | 90444 | 1.1113 | 0.8991 | 0.8991 | 0.8991 | 0.8991 | 0.8639 | 0.8544 | 0.8558 |
+| 0.8096 | 9.0000 | 101749 | 1.1159 | 0.9001 | 0.9001 | 0.9001 | 0.9001 | 0.8584 | 0.8589 | 0.8552 |
+| 0.8071 | 9.9996 | 113050 | 1.1176 | 0.8994 | 0.8994 | 0.8994 | 0.8994 | 0.8561 | 0.8577 | 0.8539 |
 
 
 ### Framework versions
@@ -82,4 +79,4 @@ The following hyperparameters were used during training:
 - Transformers 4.46.2
 - Pytorch 2.5.1+cu124
 - Datasets 3.1.0
-- Tokenizers 0.20.3
+- Tokenizers 0.20.3
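As a sanity check on the diff above, the step counts in the results table follow from the listed batch settings. A minimal sketch in plain Python; the training-set size is inferred from the table, not stated in the card, so treat it as an approximation:

```python
# Relate the listed hyperparameters to the step counts in the results table.
train_batch_size = 16
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 32

steps_per_epoch = 11305          # table: epoch 1.0 ends at step 11305
num_epochs = 10
total_steps = steps_per_epoch * num_epochs  # 113050, the final row's step

# Implied training-set size (approximate; the last batch may be partial,
# hence the fractional final epoch of 9.9996 in the table).
approx_train_examples = steps_per_epoch * total_train_batch_size
```

With a linear-to-cosine scheduler change and warmup of 1500 steps, warmup covers only about 1.3% of the 113,050 total optimization steps.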
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8247b40a616cc12f40ffab8dc6e14923b33d50ba56b479a9e183903a576c0ddc
+oid sha256:d3e0bdb9de9c80cb0df8773197590fd1aee7bb8ba1464e4649b248b791a39f34
 size 504532408
runs/Jul01_11-28-52_n30/events.out.tfevents.1751362137.n30.3629418.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:387b61d1e8fa2b3b280f4061f0ac2fb6c7bcf5c61fcb73076db5cef623e07fd7
+size 65364
runs/Jul01_11-28-52_n30/events.out.tfevents.1751380515.n30.3629418.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:17eab353e581726fa9bf9f36cfb0432c2f6e29436d64ec6b64fdf38c965fb811
+size 757
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d586a8ca17d4e6ed9c5439e93cccba6929c21065519d7962a729e2c00e5b7242
+oid sha256:05a75e87d1a13becf36da38f50338924ea1a55b1e73114f468d8f01f0d4c7de1
 size 5304
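One detail of the README metrics worth noting: in both the old and new versions of the card, Accuracy, Micro Precision, Micro Recall, and Micro F1 are identical (e.g. all 0.8985). That is expected for single-label multiclass classification, where micro-averaging reduces to plain accuracy. A small self-contained check on toy labels (hypothetical, for illustration only):

```python
# In single-label multiclass classification every prediction contributes
# one TP (if correct), or one FP for the predicted class plus one FN for
# the true class, so the micro-averaged totals all reduce to accuracy.
y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 2, 2, 2, 1, 0, 1]

labels = sorted(set(y_true) | set(y_pred))
tp = {c: 0 for c in labels}
fp = {c: 0 for c in labels}
fn = {c: 0 for c in labels}
for t, p in zip(y_true, y_pred):
    if t == p:
        tp[p] += 1
    else:
        fp[p] += 1
        fn[t] += 1

TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
micro_precision = TP / (TP + FP)
micro_recall = TP / (TP + FN)
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
accuracy = TP / len(y_true)
# micro_precision == micro_recall == micro_f1 == accuracy
```

The macro-averaged metrics do carry extra information, since they weight each relation class equally regardless of frequency.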