baby-dev committed on
Commit 4f0a2a9 · verified · 1 Parent(s): c946400

End of training

Files changed (2):
  1. README.md +6 -8
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -71,11 +71,11 @@ lr_scheduler: linear
  max_grad_norm: 1.0
  max_memory:
    0: 75GB
- max_steps: 1000
+ max_steps: 200
  micro_batch_size: 4
  mlflow_experiment_name: /tmp/00f5ac3cc66d870f_train_data.json
  model_type: AutoModelForCausalLM
- num_epochs: 3
+ num_epochs: 100
  optim_args:
    adam_beta1: 0.9
    adam_beta2: 0.95
@@ -112,7 +112,7 @@ xformers_attention: null

  This model is a fine-tuned version of [fxmarty/tiny-dummy-qwen2](https://huggingface.co/fxmarty/tiny-dummy-qwen2) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 11.9163
+ - Loss: 11.9184

  ## Model description

@@ -140,16 +140,14 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=adam_beta1=0.9,adam_beta2=0.95,adam_epsilon=1e-5
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 5
- - training_steps: 358
+ - training_steps: 200

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 11.9208 | 0.9979 | 119 | 11.9190 |
- | 11.9206 | 1.9958 | 238 | 11.9168 |
- | 11.916 | 2.9937 | 357 | 11.9166 |
- | 11.916 | 3.0021 | 358 | 11.9163 |
+ | 11.9214 | 0.9979 | 119 | 11.9196 |
+ | 11.9219 | 1.6771 | 200 | 11.9184 |


  ### Framework versions
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:577b1ea719f98665cedf01f643f3b812b9a404b2b2170e6b24314e51db25c742
+ oid sha256:a51735f763e2d089d3ea67793f9c7868b8c558d2768dfb12bf518086b4d2a481
  size 55170