error577 committed · Commit ec27d31 · verified · 1 Parent(s): e393328

End of training

Files changed (2):
1. README.md +10 -3
2. adapter_model.bin +1 -1
README.md CHANGED
@@ -69,7 +69,8 @@ lora_r: 128
  lora_target_linear: true
  lr_scheduler: cosine
  max_grad_norm: 1.0
- max_steps: 300
+ max_steps: 600
+ auto_resume_from_checkpoints: true
  micro_batch_size: 1
  mlflow_experiment_name: /tmp/000dac3a8cb81c80_train_data.json
  model_type: AutoModelForCausalLM
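Note: the hunk above doubles the step budget from 300 to 600 and turns on automatic checkpoint resume in the axolotl-style YAML config, so the run continues from the step-300 checkpoint instead of restarting. A minimal sketch of the same resume logic at the Hugging Face Trainer level (the `outputs` path is an assumption; the actual run was driven by the YAML config, not this snippet):

```python
from transformers.trainer_utils import get_last_checkpoint

# What an auto-resume flag boils down to: find the newest "checkpoint-*"
# directory in the output dir and hand it to trainer.train().
# "outputs" is an assumed path; the real output_dir is not shown in this diff.
last_ckpt = get_last_checkpoint("outputs")
print(last_ckpt)  # e.g. "outputs/checkpoint-300", or None if there is nothing to resume from

# trainer.train(resume_from_checkpoint=last_ckpt)
```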
@@ -106,7 +107,7 @@ xformers_attention: null
 
  This model is a fine-tuned version of [unsloth/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/unsloth/Qwen2.5-Coder-1.5B-Instruct) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.5562
+ - Loss: 0.5509
 
  ## Model description
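The new evaluation loss of 0.5509 is reported for the LoRA adapter trained on top of unsloth/Qwen2.5-Coder-1.5B-Instruct. A minimal sketch of loading that adapter with PEFT for inference; the adapter repo id is a placeholder, since it is not spelled out in this diff:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/Qwen2.5-Coder-1.5B-Instruct"
adapter_id = "error577/<adapter-repo>"  # placeholder: this repo's id is not shown in the diff

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # applies the LoRA weights (adapter_model.bin)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```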
 
@@ -134,7 +135,7 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 10
- - training_steps: 300
+ - training_steps: 600
 
  ### Training results
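Note: OptimizerNames.ADAMW_BNB is the 8-bit bitsandbytes AdamW (`optim="adamw_bnb_8bit"` in transformers), and the listed betas and epsilon are the library defaults. A rough Trainer-level equivalent of the hyperparameters above, as a sketch only (output_dir is assumed, and the batch size comes from the config section of the README rather than this hunk):

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters as TrainingArguments; the learning rate
# is not shown in this hunk, so it is left at the library default here.
args = TrainingArguments(
    output_dir="outputs",            # assumed
    optim="adamw_bnb_8bit",          # OptimizerNames.ADAMW_BNB; betas=(0.9, 0.999), eps=1e-8 are defaults
    lr_scheduler_type="cosine",
    warmup_steps=10,
    max_steps=600,                   # training_steps after this commit
    per_device_train_batch_size=1,   # micro_batch_size: 1 in the config
    max_grad_norm=1.0,
)
```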
 
@@ -147,6 +148,12 @@ The following hyperparameters were used during training:
  | 0.507 | 0.0224 | 200 | 0.5608 |
  | 0.5901 | 0.0280 | 250 | 0.5569 |
  | 0.5583 | 0.0336 | 300 | 0.5562 |
+ | 0.6019 | 0.0392 | 350 | 0.5567 |
+ | 0.5378 | 0.0448 | 400 | 0.5553 |
+ | 0.4796 | 0.0504 | 450 | 0.5526 |
+ | 0.5788 | 0.0560 | 500 | 0.5517 |
+ | 0.5497 | 0.0616 | 550 | 0.5511 |
+ | 0.5998 | 0.0672 | 600 | 0.5509 |
 
 
  ### Framework versions
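The added rows show the evaluation loss improving only slightly over the second 300 steps, from 0.5562 at step 300 to 0.5509 at step 600. A quick arithmetic check on the table, assuming the epoch column is cumulative and the effective batch size is constant:

```python
# Implied optimizer steps per full epoch, from the last row of the table.
steps, epoch_fraction = 600, 0.0672
print(round(steps / epoch_fraction))  # ~8929, consistent with 300 / 0.0336 from the earlier rows
```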
 
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2df4bee64edf2323611d04f90ff07ca6cd9258b1028dc07f8c8ae1df76e3a7b7
+ oid sha256:b087a55fbe8c06647f39ce4ff1f96316857798019ecce7603f7d32036343eab1
  size 591014186
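The adapter_model.bin entry is a Git LFS pointer, so only the SHA-256 oid changes in this commit while the size stays the same; the binary itself lives in LFS storage. A minimal sketch for checking a downloaded adapter_model.bin against the new oid:

```python
import hashlib

# The LFS oid is the SHA-256 of the file contents, so a local copy can be
# verified against the digest introduced by this commit.
expected = "b087a55fbe8c06647f39ce4ff1f96316857798019ecce7603f7d32036343eab1"

h = hashlib.sha256()
with open("adapter_model.bin", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

print(h.hexdigest() == expected)  # True if the download matches the pointer
```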