Model save

Files changed (2)

README.md CHANGED

@@ -19,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 10.4345
+- Loss: 9.6934
 ## Model description
@@ -39,23 +39,27 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.002
-- train_batch_size: 32
-- eval_batch_size: 32
+- train_batch_size: 64
+- eval_batch_size: 64
 - seed: 1652
 - gradient_accumulation_steps: 5
-- total_train_batch_size: 160
+- total_train_batch_size: 320
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.2
-- num_epochs: 2
+- num_epochs: 6
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 62.6423       | 1.0   | 401  | 11.5361         |
-| 53.3503       | 2.0   | 802  | 10.4345         |
+| 213.2576      | 1.0   | 201  | 22.5111         |
+| 61.0713       | 2.0   | 402  | 11.1953         |
+| 53.5773       | 3.0   | 603  | 10.4559         |
+| 51.1444       | 4.0   | 804  | 10.0289         |
+| 49.6945       | 5.0   | 1005 | 9.7750          |
+| 48.9875       | 6.0   | 1206 | 9.6934          |
 ### Framework versions
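
The batch-size and schedule values in this diff are related by simple arithmetic, which the following sketch checks. It is a sketch under stated assumptions, not the trainer's actual code: the helper names are hypothetical, a single device is assumed (consistent with 64 × 5 = 320 on the card), warmup steps are assumed to be rounded up, and the cosine curve is the standard linear-warmup-then-cosine-decay shape.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warmup steps, assuming the ratio is rounded up to a whole step."""
    return math.ceil(total_steps * warmup_ratio)


def cosine_lr(step: int, total_steps: int, warmup: int, peak_lr: float) -> float:
    """Linear warmup to peak_lr, then cosine decay towards zero."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values taken from the model card diff.
old_total = effective_batch_size(32, 5)   # before this commit
new_total = effective_batch_size(64, 5)   # after this commit
new_warmup = warmup_steps(1206, 0.2)      # 6 epochs x 201 steps per epoch

print(old_total, new_total, new_warmup)
print(cosine_lr(new_warmup, 1206, new_warmup, 0.002))  # learning rate at end of warmup
```

Doubling the per-device batch size halves the optimizer steps per epoch (401 to 201), which matches the Step column in the two results tables.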

events.out.tfevents.1753992692.bdaec93ae06a.418.1 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:abba0c17ae395e1c98cc1251b3f20b9da02db707c87923b5534bc3bb96956601
+size 359
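
The three added lines are a Git LFS pointer, not the TensorBoard event file itself: the repository stores only the spec version, a content hash, and the byte size. As an illustration, a pointer like this could be parsed as below. The function name `parse_lfs_pointer` and the returned dictionary keys are hypothetical, and the parser handles only the three fields shown here.

```python
# Pointer text copied from the diff above, including the leading "+" markers.
POINTER = """\
+version https://git-lfs.github.com/spec/v1
+oid sha256:abba0c17ae395e1c98cc1251b3f20b9da02db707c87923b5534bc3bb96956601
+size 359
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # drop diff "+" markers if present
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size in bytes of the real file
    }


info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```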