End of training
- README.md +8 -7
- adapter_model.bin +1 -1
README.md
CHANGED
@@ -41,7 +41,7 @@ debug: null
 deepspeed: null
 device_map: auto
 do_eval: true
-early_stopping_patience: 5
+# early_stopping_patience: 5
 eval_batch_size: 4
 eval_max_new_tokens: 128
 eval_steps: 20
@@ -88,7 +88,8 @@ pad_to_sequence_len: true
 resume_from_checkpoint: null
 s2_attention: null
 sample_packing: false
-save_steps: 20
+# save_steps: 20
+save_strategy: 'no'
 saves_per_epoch: null
 sequence_len: 512
 strict: false
@@ -115,7 +116,7 @@ xformers_attention: null
 
 This model is a fine-tuned version of [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.
+- Loss: 2.1624
 
 ## Model description
 
@@ -150,10 +151,10 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 2.4891 | 0.0004 | 1 | 2.7103 |
-| 2.
-| 2.
-| 2.
-| 2.
+| 2.3211 | 0.0085 | 20 | 2.3435 |
+| 2.1925 | 0.0169 | 40 | 2.2119 |
+| 2.1566 | 0.0254 | 60 | 2.1687 |
+| 2.2616 | 0.0339 | 80 | 2.1624 |
 
 
 ### Framework versions
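To make the updated results table easier to read, here is a minimal Python sketch that computes the change in validation loss between evaluations and a perplexity value for each row. The rows are copied from the new side of the README diff. The names `ROWS` and `summarize` are illustrative and not part of the repository, and the perplexity column assumes the reported loss is a mean per-token cross-entropy in nats, which the README does not state.

```python
import math

# Rows copied from the updated README table:
# (training loss, epoch, step, validation loss)
ROWS = [
    (2.4891, 0.0004, 1, 2.7103),
    (2.3211, 0.0085, 20, 2.3435),
    (2.1925, 0.0169, 40, 2.2119),
    (2.1566, 0.0254, 60, 2.1687),
    (2.2616, 0.0339, 80, 2.1624),
]


def summarize(rows):
    """Return (step, val_loss, perplexity, change vs previous eval) per row."""
    summary = []
    previous = None
    for _train_loss, _epoch, step, val_loss in rows:
        # The first evaluation has nothing to compare against.
        delta = None if previous is None else round(val_loss - previous, 4)
        # Assumption: loss is mean per-token cross-entropy in nats.
        summary.append((step, val_loss, math.exp(val_loss), delta))
        previous = val_loss
    return summary


if __name__ == "__main__":
    for step, val_loss, ppl, delta in summarize(ROWS):
        change = "n/a" if delta is None else f"{delta:+.4f}"
        print(f"step {step:>3}: val_loss={val_loss:.4f} ppl={ppl:.2f} change={change}")
```

Read this way, validation loss falls at every evaluation, but the improvement shrinks from 0.3668 between steps 1 and 20 to 0.0063 between steps 60 and 80. The final value, 2.1624, matches the `Loss` line reported in the README.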
adapter_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:22bb3a87826d4d2cb7a61315cfee8b0249768feb50ec86ff9b53266eeeca7a2f
 size 70506570
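`adapter_model.bin` is stored as a Git LFS pointer, so the diff shows only the pointer's three fields, not the binary weights. The sketch below parses a pointer and checks a downloaded file against it. The pointer text is the new version from the diff. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the parser assumes the simple `key value` line layout visible here.

```python
import hashlib

# New pointer contents, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:22bb3a87826d4d2cb7a61315cfee8b0249768feb50ec86ff9b53266eeeca7a2f
size 70506570
"""


def parse_lfs_pointer(text):
    """Parse pointer text into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is "<key> <value>", split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["oid"]


if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER_TEXT))
```

The `size` field is 70506570 bytes on both sides of the diff and only the `oid` changes. That is consistent with an adapter of the same shape holding different weights, though the pointer alone cannot confirm it.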