ZON8955 committed
Commit d47ac11 · verified · 1 Parent(s): 3204f08

End of training

Files changed (1):
  1. README.md +23 -13
README.md CHANGED

@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: hfl/chinese-macbert-base
+base_model: bert-base-chinese
 tags:
 - generated_from_trainer
 model-index:
@@ -14,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->

 # classification_2

-This model is a fine-tuned version of [hfl/chinese-macbert-base](https://huggingface.co/hfl/chinese-macbert-base) on the None dataset.
+This model is a fine-tuned version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2281
+- Loss: 0.0260

 ## Model description

@@ -35,26 +35,36 @@ More information needed
 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate: 5e-05
+- learning_rate: 2e-05
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 3
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 10

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.5343 | 1.0 | 20 | 0.3772 |
-| 0.0912 | 2.0 | 40 | 0.6497 |
-| 0.0095 | 3.0 | 60 | 0.2281 |
+| No log | 1.0 | 3 | 0.6182 |
+| No log | 2.0 | 6 | 0.5323 |
+| No log | 3.0 | 9 | 0.3020 |
+| 0.602 | 4.0 | 12 | 0.1746 |
+| 0.602 | 5.0 | 15 | 0.1003 |
+| 0.602 | 6.0 | 18 | 0.0594 |
+| 0.1176 | 7.0 | 21 | 0.0400 |
+| 0.1176 | 8.0 | 24 | 0.0310 |
+| 0.1176 | 9.0 | 27 | 0.0273 |
+| 0.0277 | 10.0 | 30 | 0.0260 |


 ### Framework versions

-- Transformers 4.51.3
-- Pytorch 2.6.0+cu124
-- Datasets 3.5.0
-- Tokenizers 0.21.1
+- Transformers 4.57.3
+- Pytorch 2.9.0+cu126
+- Datasets 4.0.0
+- Tokenizers 0.22.1
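
For context, the updated hyperparameters imply an effective train batch size of 4 × 8 = 32 (per-device batch × gradient accumulation steps), matching the card's total_train_batch_size. Below is a minimal, hedged sketch of how these values map onto transformers' TrainingArguments; the toy dataset, binary label count, and output path are illustrative assumptions, not part of this commit.

```python
# Sketch of a training setup matching the card's hyperparameters.
# Only the hyperparameter values come from the README diff; the toy
# corpus, num_labels=2, and output_dir are assumptions.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "bert-base-chinese"  # base model after this commit
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

# Tiny stand-in corpus so the sketch runs end to end (assumption).
data = Dataset.from_dict(
    {"text": ["质量很好", "质量很差"] * 8, "label": [1, 0] * 8}
)
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=32),
    batched=True,
)
split = data.train_test_split(test_size=0.25, seed=42)

args = TrainingArguments(
    output_dir="classification_2",
    learning_rate=2e-5,             # learning_rate
    per_device_train_batch_size=4,  # train_batch_size
    per_device_eval_batch_size=4,   # eval_batch_size
    seed=42,
    gradient_accumulation_steps=8,  # effective batch: 4 * 8 = 32
    optim="adamw_torch_fused",      # OptimizerNames.ADAMW_TORCH_FUSED (needs CUDA)
    lr_scheduler_type="linear",
    warmup_ratio=0.1,               # lr_scheduler_warmup_ratio
    num_train_epochs=10,
    eval_strategy="epoch",          # assumption: card reports eval loss per epoch
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=split["train"],
    eval_dataset=split["test"],
    processing_class=tokenizer,  # lets Trainer pad batches dynamically
)
trainer.train()
```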
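And a hedged usage sketch for the resulting checkpoint; the repo id is inferred from the committer name and the model name in this card and may not match the actual hosted path.

```python
# Load the fine-tuned classifier via the pipeline API.
# "ZON8955/classification_2" is an assumed repo id; the label names
# depend on how the training labels were configured.
from transformers import pipeline

clf = pipeline("text-classification", model="ZON8955/classification_2")
print(clf("这个产品的质量非常好"))
# e.g. [{'label': 'LABEL_1', 'score': 0.99}]
```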