baby-dev/test-default1

baby-dev committed on Jan 31, 2025

Commit 4c04564 · verified · 1 Parent(s): 9731301

End of training

Files changed (2)

README.md +8 -7
adapter_model.bin +1 -1

README.md CHANGED

@@ -41,7 +41,7 @@ debug: null
 deepspeed: null
 device_map: auto
 do_eval: true
-early_stopping_patience: 5
+# early_stopping_patience: 5
 eval_batch_size: 4
 eval_max_new_tokens: 128
 eval_steps: 20
@@ -88,7 +88,8 @@ pad_to_sequence_len: true
 resume_from_checkpoint: null
 s2_attention: null
 sample_packing: false
-save_steps: 20
+# save_steps: 20
+save_strategy: 'no'
 saves_per_epoch: null
 sequence_len: 512
 strict: false
@@ -115,7 +116,7 @@ xformers_attention: null
 This model is a fine-tuned version of [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.1632
+- Loss: 2.1624
 ## Model description
@@ -150,10 +151,10 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 2.4891        | 0.0004 | 1    | 2.7103          |
-| 2.3208        | 0.0085 | 20   | 2.3450          |
-| 2.1896        | 0.0169 | 40   | 2.2134          |
-| 2.1554        | 0.0254 | 60   | 2.1695          |
-| 2.266         | 0.0339 | 80   | 2.1632          |
+| 2.3211        | 0.0085 | 20   | 2.3435          |
+| 2.1925        | 0.0169 | 40   | 2.2119          |
+| 2.1566        | 0.0254 | 60   | 2.1687          |
+| 2.2616        | 0.0339 | 80   | 2.1624          |
 ### Framework versions
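
To make the change in the results table easier to read, here is a minimal sketch that compares the validation loss at each evaluation step before and after this commit. The numbers are copied from the removed and added table rows in the README.md diff; the names `old_val_loss`, `new_val_loss`, and `loss_deltas` are hypothetical and not part of the repository.

```python
# Validation loss per evaluation step, copied from the README.md diff.
# "old" holds the removed rows, "new" holds the added rows.
old_val_loss = {1: 2.7103, 20: 2.3450, 40: 2.2134, 60: 2.1695, 80: 2.1632}
new_val_loss = {1: 2.7103, 20: 2.3435, 40: 2.2119, 60: 2.1687, 80: 2.1624}


def loss_deltas(old, new):
    """Return {step: new - old} for steps present in both runs, rounded to 4 places."""
    return {
        step: round(new[step] - old[step], 4)
        for step in sorted(old)
        if step in new
    }


deltas = loss_deltas(old_val_loss, new_val_loss)
for step, delta in deltas.items():
    print(
        f"step {step:>2}: {old_val_loss[step]:.4f} -> "
        f"{new_val_loss[step]:.4f} ({delta:+.4f})"
    )
```

The final reported loss moves from 2.1632 to 2.1624, a difference of 0.0008 at step 80. The diff alone does not show whether that shift comes from the configuration change or from ordinary run-to-run variation.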

adapter_model.bin CHANGED Viewed

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:81e043f9582095aa268d39697c274421d7ab862d110fb59d8c5bebca80290330
+oid sha256:22bb3a87826d4d2cb7a61315cfee8b0249768feb50ec86ff9b53266eeeca7a2f
 size 70506570
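
The adapter_model.bin diff shows Git LFS pointer files, not the weights themselves: each pointer records a spec version, a SHA-256 object id, and a size in bytes. As a rough sketch, the pointers can be parsed and compared as below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is copied from the diff above.

```python
import hashlib

# Pointer contents copied from the adapter_model.bin diff (old and new side).
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:81e043f9582095aa268d39697c274421d7ab862d110fb59d8c5bebca80290330
size 70506570
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:22bb3a87826d4d2cb7a61315cfee8b0249768feb50ec86ff9b53266eeeca7a2f
size 70506570
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Check that raw bytes have the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
print("same size:", old["size"] == new["size"])
print("same content hash:", old["digest"] == new["digest"])
```

Both pointers report 70506570 bytes but different object ids, so the adapter weights were replaced by a file of identical size with different contents. `matches_pointer` could be used to check a downloaded copy of adapter_model.bin against the new pointer, assuming the whole file fits in memory.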