Model save

- README.md +24 -24
- adapter_model.safetensors +1 -1
README.md CHANGED

@@ -17,10 +17,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 # test
 
-This model is a fine-tuned version of [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) on
+This model is a fine-tuned version of [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Num Input Tokens Seen:
+- Loss: 0.3559
+- Num Input Tokens Seen: 43600
 
 ## Model description
 
@@ -43,7 +43,7 @@ The following hyperparameters were used during training:
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 123
-- optimizer: Use
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 1
@@ -52,31 +52,31 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
 |:-------------:|:------:|:----:|:---------------:|:-----------------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 0.7689 | 0.0522 | 13 | 0.6838 | 2288 |
+| 0.6557 | 0.1044 | 26 | 0.4604 | 4656 |
+| 0.3647 | 0.1566 | 39 | 0.3835 | 6944 |
+| 0.3506 | 0.2088 | 52 | 0.3836 | 9232 |
+| 0.3084 | 0.2610 | 65 | 0.3691 | 11424 |
+| 0.3649 | 0.3133 | 78 | 0.3669 | 13760 |
+| 0.3612 | 0.3655 | 91 | 0.3621 | 16048 |
+| 0.2896 | 0.4177 | 104 | 0.3752 | 18272 |
+| 0.4278 | 0.4699 | 117 | 0.3691 | 20656 |
+| 0.3591 | 0.5221 | 130 | 0.3583 | 23056 |
+| 0.3726 | 0.5743 | 143 | 0.3531 | 25312 |
+| 0.3829 | 0.6265 | 156 | 0.3520 | 27552 |
+| 0.3318 | 0.6787 | 169 | 0.3502 | 29984 |
+| 0.3655 | 0.7309 | 182 | 0.3543 | 32080 |
+| 0.3703 | 0.7831 | 195 | 0.3526 | 34176 |
+| 0.3585 | 0.8353 | 208 | 0.3535 | 36512 |
+| 0.3626 | 0.8876 | 221 | 0.3517 | 38912 |
+| 0.3419 | 0.9398 | 234 | 0.3497 | 41120 |
+| 0.3311 | 0.9920 | 247 | 0.3559 | 43600 |
 
 
 ### Framework versions
 
 - PEFT 0.17.1
 - Transformers 4.51.3
-- Pytorch 2.
+- Pytorch 2.10.0+cu128
 - Datasets 4.0.0
 - Tokenizers 0.21.4
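The hyperparameter list in the second hunk maps directly onto `transformers` `TrainingArguments`. Below is a minimal sketch of the corresponding configuration, assuming a standard `Trainer`-style run; the `output_dir` is a placeholder, the learning rate is omitted because it is not visible in this diff, and the 13-step eval cadence is read off the results table.

```python
# A minimal sketch, not the card's actual training script. output_dir is a
# placeholder and the learning rate is omitted (not shown in this diff);
# the other values come from the hyperparameter list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="test",               # placeholder; the card is titled "test"
    per_device_train_batch_size=2,   # train_batch_size: 2
    per_device_eval_batch_size=2,    # eval_batch_size: 2
    seed=123,                        # seed: 123
    optim="adamw_torch",             # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,                  # betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,              # epsilon=1e-08
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                # lr_scheduler_warmup_ratio: 0.1
    num_train_epochs=1,              # num_epochs: 1
    eval_strategy="steps",           # the table evaluates every 13 steps
    eval_steps=13,
    logging_steps=13,
)
```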
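Because the commit saves only a PEFT adapter (the `adapter_model.safetensors` change below) rather than full model weights, inference means loading the Llama-3.2-1B-Instruct base model and attaching the adapter on top. A hedged sketch follows; `your-username/test` stands in for this repository's actual id.

```python
# A minimal usage sketch, assuming the repository holds a PEFT (LoRA-style)
# adapter for meta-llama/Llama-3.2-1B-Instruct. "your-username/test" is a
# hypothetical repo id, not a name taken from the card.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Attach the fine-tuned adapter weights (adapter_model.safetensors).
model = PeftModel.from_pretrained(base_model, "your-username/test")
model.eval()

inputs = tokenizer("Hello!", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```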
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0e962c186a051742786c79d3b1c1078f44102b5980e7972415ae55fedac6ce56
 size 335717200
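The file shown above is a Git LFS pointer rather than the weights themselves: `oid` is the SHA-256 digest of the real file's contents and `size` is its byte length. A small sketch for checking a downloaded copy against this pointer; the local path is an assumption.

```python
# A minimal sketch: verify a downloaded adapter_model.safetensors against the
# LFS pointer above. oid is the SHA-256 of the file contents; size is its
# byte length. The local path is an assumption.
import hashlib
import os

EXPECTED_OID = "0e962c186a051742786c79d3b1c1078f44102b5980e7972415ae55fedac6ce56"
EXPECTED_SIZE = 335_717_200

def matches_pointer(path: str) -> bool:
    if os.path.getsize(path) != EXPECTED_SIZE:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == EXPECTED_OID

print(matches_pointer("adapter_model.safetensors"))
```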