End of training

Files changed (3)

README.md +16 -19
adapter_model.bin +1 -1
adapter_model.safetensors +1 -1

README.md CHANGED

@@ -28,11 +28,11 @@ datasets:
   type: alpaca
 debug: null
 deepspeed: null
-early_stopping_patience: null
 eval_max_new_tokens: 128
 eval_sample_packing: false
 eval_table_size: null
-evals_per_epoch: 4
 flash_attention: true
 fp16: null
 fsdp: null
@@ -65,8 +65,8 @@ output_dir: miner_id_besimray
 pad_to_sequence_len: false
 resume_from_checkpoint: null
 s2_attention: null
-sample_packing: true
-save_steps: 5
 save_strategy: steps
 sequence_len: 4096
 strict: false
@@ -80,7 +80,7 @@ wandb_project: Public_TuningSN
 wandb_run: miner_id_24
 wandb_runid: 383a850e-bb15-45a2-8f4b-fc96eb001a74
 warmup_steps: 10
-weight_decay: 0.0
 xformers_attention: null
 ```
@@ -91,7 +91,7 @@ xformers_attention: null
 This model is a fine-tuned version of [unsloth/Llama-3.2-1B-Instruct](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1848
 ## Model description
@@ -119,22 +119,19 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- training_steps: 10
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 1.2878        | 0.5   | 1    | 1.2576          |
-| 1.294         | 1.0   | 2    | 1.2571          |
-| 1.2719        | 1.375 | 3    | 1.2468          |
-| 1.2869        | 1.875 | 4    | 1.2302          |
-| 1.2828        | 2.25  | 5    | 1.2147          |
-| 1.2449        | 2.75  | 6    | 1.2145          |
-| 1.2385        | 3.125 | 7    | 1.2129          |
-| 1.2142        | 3.625 | 8    | 1.2061          |
-| 1.2725        | 4.125 | 9    | 1.1927          |
-| 1.2282        | 4.5   | 10   | 1.1848          |
 ### Framework versions

   type: alpaca
 debug: null
 deepspeed: null
+early_stopping_patience: 3
 eval_max_new_tokens: 128
 eval_sample_packing: false
+eval_steps: 20
 eval_table_size: null
 flash_attention: true
 fp16: null
 fsdp: null
 pad_to_sequence_len: false
 resume_from_checkpoint: null
 s2_attention: null
+sample_packing: false
+save_steps: 20
 save_strategy: steps
 sequence_len: 4096
 strict: false
 wandb_run: miner_id_24
 wandb_runid: 383a850e-bb15-45a2-8f4b-fc96eb001a74
 warmup_steps: 10
+weight_decay: 0.01
 xformers_attention: null
 ```
 This model is a fine-tuned version of [unsloth/Llama-3.2-1B-Instruct](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.1679
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
+- training_steps: 150
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.3028        | 0.0211 | 1    | 1.2579          |
+| 1.3521        | 0.4211 | 20   | 1.1702          |
+| 1.1977        | 0.8421 | 40   | 1.1533          |
+| 1.099         | 1.2632 | 60   | 1.1519          |
+| 1.0658        | 1.6842 | 80   | 1.1523          |
+| 1.0091        | 2.1053 | 100  | 1.1575          |
+| 1.1045        | 2.5263 | 120  | 1.1679          |
 ### Framework versions
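
Note on the new training results: the config now sets `early_stopping_patience: 3` and `eval_steps: 20`, and the table ends at step 120 although `training_steps` is 150. The sketch below replays the validation losses from the table through a generic patience counter to show that this is consistent with early stopping. It is a minimal illustration, not the trainer's actual callback; the function name `early_stop_step` is hypothetical, and it assumes any strictly lower validation loss counts as an improvement.

```python
def early_stop_step(evals, patience):
    """Return the step at which training would stop, or None if it never triggers.

    evals: list of (step, validation_loss) pairs in evaluation order.
    patience: number of consecutive non-improving evaluations tolerated.
    """
    best = float("inf")
    bad_evals = 0
    for step, loss in evals:
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return step
    return None


# (step, validation loss) pairs copied from the new training-results table.
evals = [
    (1, 1.2579), (20, 1.1702), (40, 1.1533), (60, 1.1519),
    (80, 1.1523), (100, 1.1575), (120, 1.1679),
]

stop = early_stop_step(evals, patience=3)
best_step, best_loss = min(evals, key=lambda e: e[1])
print(stop, best_step, best_loss)  # 120 60 1.1519
```

Under these assumptions the best validation loss (1.1519) occurs at step 60, the next three evaluations do not improve on it, and the run stops at step 120. The reported evaluation loss of 1.1679 matches the last row of the table rather than the minimum.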

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:663e8219297d65bc184eb8f4abcf0507f9efed03f9d35d5b9fc1a1f5ffdd8295
 size 45169354

 version https://git-lfs.github.com/spec/v1
+oid sha256:06f5758c8cb21c14363a1d12df2489b3350d89721ddf69a0a83c34d8d1b99ba6
 size 45169354

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e36bb4966b3713f17079f0f0073225f3c17789e78598436f125bc5847c546220
 size 45118424

 version https://git-lfs.github.com/spec/v1
+oid sha256:87f7d1cab1f1c1f94445cd22369257aaa21529875106c6c56c58b5307c1bc477
 size 45118424
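
Both adapter files are stored as Git LFS pointers, so the diff only shows a new `oid` with an unchanged `size`. As a rough sketch, a downloaded file can be checked against its pointer by comparing byte size and SHA-256 digest. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage note is an assumption.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into (hash algorithm, hex digest, size in bytes)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return algo, digest, int(fields["size"])


def matches_pointer(path, pointer_text):
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    algo, digest, size = parse_lfs_pointer(pointer_text)
    if algo != "sha256" or os.path.getsize(path) != size:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == digest


# New pointer for adapter_model.safetensors, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:87f7d1cab1f1c1f94445cd22369257aaa21529875106c6c56c58b5307c1bc477
size 45118424
"""
```

With the file downloaded next to the script (an assumed location), `matches_pointer("adapter_model.safetensors", POINTER)` should return `True` for the new revision and `False` for the previous one, since the size is identical but the digest differs.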