This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4892
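
A minimal usage sketch with `transformers` (the repo id below is a placeholder, since this card does not state the checkpoint's Hub id):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: substitute this checkpoint's actual Hub repo id.
repo_id = "your-username/llama-3.2-3b-finetune"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Llama 3.2 Instruct checkpoints expect chat-formatted prompts.
messages = [{"role": "user", "content": "Summarize what a linear LR schedule does."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```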
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
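
These settings map onto `transformers.TrainingArguments` roughly as follows (a sketch, not the actual training script, which is not part of this card; `output_dir` is an assumed placeholder). Note that the total train batch size of 16 is the per-device batch of 4 multiplied by the 4 gradient-accumulation steps:

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="llama-3.2-3b-finetune",  # assumed placeholder name
    learning_rate=5e-05,
    per_device_train_batch_size=4,       # "train_batch_size" above
    per_device_eval_batch_size=8,        # "eval_batch_size" above
    seed=42,
    gradient_accumulation_steps=4,       # effective train batch: 4 * 4 = 16
    optim="adamw_torch",                 # AdamW, default betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                           # "Native AMP" mixed-precision training
)
```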
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.8207        | 1.0    | 83   | 0.7515          |
| 0.7171        | 2.0    | 166  | 0.6729          |
| 0.6519        | 3.0    | 249  | 0.6164          |
| 0.6042        | 4.0    | 332  | 0.5788          |
| 0.5645        | 5.0    | 415  | 0.5496          |
| 0.5229        | 6.0    | 498  | 0.5265          |
| 0.5125        | 7.0    | 581  | 0.5096          |
| 0.4959        | 8.0    | 664  | 0.4981          |
| 0.5099        | 9.0    | 747  | 0.4913          |
| 0.5066        | 9.8848 | 820  | 0.4892          |
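
The validation loss falls monotonically from 0.7515 to 0.4892 across the ten epochs. As a back-of-the-envelope check, 83 optimizer steps per epoch at an effective batch size of 16 implies a training set of roughly 83 × 16 ≈ 1,300 examples (an estimate from the step counts above; the card does not state the dataset size).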
### Framework versions