Mathildeholst committed on
Commit 5d49660 · verified · 1 Parent(s): aaec225

End of training

Files changed (1)
  1. README.md +10 -29
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 4.3129
+ - Loss: 3.5012

  ## Model description

@@ -36,7 +36,7 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 0.0005
- - train_batch_size: 2
+ - train_batch_size: 8
  - eval_batch_size: 8
  - seed: 42
  - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
@@ -45,33 +45,14 @@ The following hyperparameters were used during training:

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 2.6489 | 0.08 | 200 | 4.0413 |
- | 2.5755 | 0.16 | 400 | 4.1157 |
- | 2.6471 | 0.24 | 600 | 4.2025 |
- | 2.568 | 0.32 | 800 | 4.1837 |
- | 2.5392 | 0.4 | 1000 | 4.2556 |
- | 2.46 | 0.48 | 1200 | 4.2362 |
- | 2.2725 | 0.56 | 1400 | 4.2459 |
- | 2.326 | 0.64 | 1600 | 4.2492 |
- | 2.2857 | 0.72 | 1800 | 4.3178 |
- | 2.2538 | 0.8 | 2000 | 4.2604 |
- | 2.4349 | 0.88 | 2200 | 4.0120 |
- | 3.332 | 0.96 | 2400 | 3.9160 |
- | 2.442 | 1.04 | 2600 | 4.3653 |
- | 1.5446 | 1.12 | 2800 | 4.3994 |
- | 1.5495 | 1.2 | 3000 | 4.4141 |
- | 1.687 | 1.28 | 3200 | 4.4201 |
- | 1.706 | 1.3600 | 3400 | 4.3880 |
- | 1.6803 | 1.44 | 3600 | 4.4039 |
- | 1.7366 | 1.52 | 3800 | 4.2966 |
- | 1.7932 | 1.6 | 4000 | 4.3586 |
- | 1.7363 | 1.6800 | 4200 | 4.3449 |
- | 1.7873 | 1.76 | 4400 | 4.2599 |
- | 1.8261 | 1.8400 | 4600 | 4.2841 |
- | 1.7526 | 1.92 | 4800 | 4.3207 |
- | 1.8271 | 2.0 | 5000 | 4.3129 |
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 3.1881 | 0.32 | 200 | 3.4100 |
+ | 2.9542 | 0.64 | 400 | 3.3867 |
+ | 2.8132 | 0.96 | 600 | 3.3382 |
+ | 1.885 | 1.28 | 800 | 3.4916 |
+ | 1.8423 | 1.6 | 1000 | 3.4918 |
+ | 1.8546 | 1.92 | 1200 | 3.5012 |


  ### Framework versions
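
Reviewer note on the updated "Training results" table: the numbers can be sanity-checked with a short script. The sketch below is not part of the commit; it only re-reads the six added rows. The helper names (`best_row`, `steps_per_epoch`, `perplexity`) are hypothetical, and two assumptions are made that the diff does not confirm: the reported loss is a mean token-level cross-entropy in nats, and training ran on a single device without gradient accumulation.

```python
import math

# Rows copied from the added (+) side of the diff:
# (training_loss, epoch, step, validation_loss)
NEW_RUN = [
    (3.1881, 0.32, 200, 3.4100),
    (2.9542, 0.64, 400, 3.3867),
    (2.8132, 0.96, 600, 3.3382),
    (1.885, 1.28, 800, 3.4916),
    (1.8423, 1.6, 1000, 3.4918),
    (1.8546, 1.92, 1200, 3.5012),
]

TRAIN_BATCH_SIZE = 8  # from the updated hyperparameter list


def best_row(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the first logged row."""
    _, epoch, step, _ = rows[0]
    return round(step / epoch)


def perplexity(loss):
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


best = best_row(NEW_RUN)
final = NEW_RUN[-1]

print(f"lowest validation loss: {best[3]:.4f} at step {best[2]}")
print(f"final validation loss:  {final[3]:.4f} at step {final[2]}")
print(f"perplexity at final checkpoint: {perplexity(final[3]):.1f}")

# Assumes one device and no gradient accumulation (not stated in the diff).
n_steps = steps_per_epoch(NEW_RUN)
print(f"steps per epoch: {n_steps}, examples per epoch: {n_steps * TRAIN_BATCH_SIZE}")
```

Under those assumptions the script points at two things worth noting in review. The lowest validation loss in the new table is at step 600, near the end of the first epoch, while the reported final loss of 3.5012 comes from step 1200, after the training loss has dropped sharply and the validation loss has risen; that pattern is often a sign of overfitting in the second epoch. The step-to-epoch ratio also implies about 5000 training examples per epoch, which matches the removed table (step 200 at epoch 0.08 with a batch size of 2), so the two runs appear to use a dataset of the same size.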