End of training

- README.md +34 -21
- adapter_model.bin +1 -1
- adapter_model.safetensors +1 -1
README.md CHANGED

@@ -54,9 +54,9 @@ gradient_checkpointing: false
 group_by_length: true
 hub_model_id: baby-dev/test-09-01
 hub_repo: null
-hub_strategy:
+hub_strategy: checkpoint
 hub_token: null
-learning_rate: 0.
+learning_rate: 0.0001
 load_in_4bit: false
 load_in_8bit: false
 local_rank: null
@@ -67,7 +67,7 @@ lora_fan_in_fan_out: null
 lora_model_dir: null
 lora_r: 32
 lora_target_linear: true
-lr_scheduler:
+lr_scheduler: linear
 max_grad_norm: 1.0
 max_memory:
   0: 75GB
@@ -113,7 +113,7 @@ xformers_attention: null
 
 This model is a fine-tuned version of [peft-internal-testing/tiny-dummy-qwen2](https://huggingface.co/peft-internal-testing/tiny-dummy-qwen2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 11.
+- Loss: 11.8994
 
 ## Model description
 
@@ -132,14 +132,14 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.
+- learning_rate: 0.0001
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=adam_beta1=0.9,adam_beta2=0.95,adam_epsilon=1e-5
-- lr_scheduler_type:
+- lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 50
 - training_steps: 6007
 
@@ -148,21 +148,34 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
 | No log | 0.0083 | 1 | 11.9304 |
-| 12.
-| 11.
-| 11.
-| 11.
-| 12.
-| 11.
-| 11.
-| 11.
-| 12.
-| 11.
-| 11.
-| 11.
-| 12.
-| 11.
-| 11.
+| 12.1074 | 1.2474 | 150 | 11.9141 |
+| 11.917 | 2.4948 | 300 | 11.9077 |
+| 11.9081 | 3.7422 | 450 | 11.9052 |
+| 11.9026 | 4.9896 | 600 | 11.9038 |
+| 12.0859 | 6.2370 | 750 | 11.9026 |
+| 11.9028 | 7.4844 | 900 | 11.9019 |
+| 11.8998 | 8.7318 | 1050 | 11.9016 |
+| 11.9048 | 9.9792 | 1200 | 11.9015 |
+| 12.084 | 11.2266 | 1350 | 11.9014 |
+| 11.8994 | 12.4740 | 1500 | 11.9011 |
+| 11.8969 | 13.7214 | 1650 | 11.9008 |
+| 11.8969 | 14.9688 | 1800 | 11.9005 |
+| 12.0752 | 16.2162 | 1950 | 11.9004 |
+| 11.8995 | 17.4636 | 2100 | 11.9006 |
+| 11.9041 | 18.7110 | 2250 | 11.9004 |
+| 11.9008 | 19.9584 | 2400 | 11.9004 |
+| 12.0829 | 21.2058 | 2550 | 11.9002 |
+| 11.9013 | 22.4532 | 2700 | 11.8999 |
+| 11.9025 | 23.7006 | 2850 | 11.8999 |
+| 11.8988 | 24.9480 | 3000 | 11.8996 |
+| 12.0787 | 26.1954 | 3150 | 11.8996 |
+| 11.8966 | 27.4428 | 3300 | 11.8996 |
+| 11.8997 | 28.6902 | 3450 | 11.8996 |
+| 11.9017 | 29.9376 | 3600 | 11.8995 |
+| 12.0742 | 31.1850 | 3750 | 11.8995 |
+| 11.8992 | 32.4324 | 3900 | 11.8992 |
+| 11.9043 | 33.6798 | 4050 | 11.8994 |
+| 11.895 | 34.9272 | 4200 | 11.8994 |
 
 
 ### Framework versions
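The hyperparameters above compose in two simple ways: the effective batch size is train_batch_size × gradient_accumulation_steps = 4 × 4 = 16, and the linear scheduler ramps the learning rate up over the 50 warmup steps, then decays it toward zero by step 6007. A minimal sketch of that schedule, assuming the usual transformers-style linear-with-warmup semantics (this is an illustration, not the trainer's actual code):

```python
def lr_at_step(step, base_lr=1e-4, warmup_steps=50, total_steps=6007):
    """Linear warmup to base_lr over warmup_steps, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# Effective (total) train batch size = per-device batch * accumulation steps.
effective_batch = 4 * 4  # matches total_train_batch_size: 16 above
```

At step 25 the rate is halfway up the ramp (5e-05), at step 50 it peaks at 1e-4, and by step 6007 it has decayed back to zero.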
adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b4508921f2ccb5fd88d4646b57d46f9749b5d720c519b7e10c595067c7e6ded1
 size 55170
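The oid in an LFS pointer is the SHA-256 digest of the blob's contents, so a downloaded adapter file can be verified against the pointer by hashing it locally. A minimal sketch (the helper name `lfs_oid` is hypothetical):

```python
import hashlib

def lfs_oid(path):
    """Compute the sha256 digest an LFS pointer would record for this file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large adapter files don't load into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing `lfs_oid("adapter_model.bin")` with the `oid sha256:` field confirms the pointer and the blob match.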
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:24c671988f484779d1bc65950834eaef9e98c12954bf650fea29c88a72d70f6b
 size 48552
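The pointer files themselves are plain key/value text (one `key value` pair per line), so the recorded oid and size can be read programmatically. A small sketch using the safetensors pointer above (`parse_lfs_pointer` is a hypothetical helper):

```python
def parse_lfs_pointer(text):
    """Split an LFS pointer into its key/value fields (version, oid, size)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:24c671988f484779d1bc65950834eaef9e98c12954bf650fea29c88a72d70f6b
size 48552"""

fields = parse_lfs_pointer(pointer)
```

`fields["size"]` gives the blob size in bytes as a string, and `fields["oid"]` carries the `sha256:` prefix that names the hash algorithm.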