End of training

Files changed (9)

README.md +53 -13
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
runs/Nov26_02-14-09_localhost/events.out.tfevents.1732567452.localhost +2 -2
runs/Nov27_12-37-02_localhost/events.out.tfevents.1732691225.localhost +3 -0
runs/Nov27_12-42-53_localhost/events.out.tfevents.1732691574.localhost +3 -0
runs/Nov27_12-50-42_localhost/events.out.tfevents.1732692043.localhost +3 -0
runs/Nov27_12-52-19_localhost/events.out.tfevents.1732692140.localhost +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [bigcode/starcoderbase-1b](https://huggingface.co/bigcode/starcoderbase-1b) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.2169
 ## Model description
@@ -43,23 +43,63 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 50
-- training_steps: 5000
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 0.8313        | 0.9960 | 500  | 1.2189          |
-| 0.5881        | 1.9920 | 1000 | 1.2062          |
-| 0.5283        | 2.9880 | 1500 | 1.2020          |
-| 0.4928        | 3.9841 | 2000 | 1.1926          |
-| 0.4682        | 4.9801 | 2500 | 1.2326          |
-| 0.4481        | 5.9761 | 3000 | 1.2139          |
-| 0.4383        | 6.9721 | 3500 | 1.2099          |
-| 0.4305        | 7.9681 | 4000 | 1.2161          |
-| 0.4251        | 8.9641 | 4500 | 1.2138          |
-| 0.423         | 9.9602 | 5000 | 1.2169          |
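In the removed table the training loss falls at every logged step while the validation loss does not. A minimal sketch, using only the rows shown above, of how one might pick the logged step with the lowest validation loss and measure the final train/validation gap. The helper names `best_checkpoint` and `generalization_gap` are illustrative and are not part of the training script.

```python
# Rows copied from the removed table:
# (training_loss, epoch, step, validation_loss)
old_run = [
    (0.8313, 0.9960, 500, 1.2189),
    (0.5881, 1.9920, 1000, 1.2062),
    (0.5283, 2.9880, 1500, 1.2020),
    (0.4928, 3.9841, 2000, 1.1926),
    (0.4682, 4.9801, 2500, 1.2326),
    (0.4481, 5.9761, 3000, 1.2139),
    (0.4383, 6.9721, 3500, 1.2099),
    (0.4305, 7.9681, 4000, 1.2161),
    (0.4251, 8.9641, 4500, 1.2138),
    (0.423, 9.9602, 5000, 1.2169),
]


def best_checkpoint(rows):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    _, _, step, val_loss = min(rows, key=lambda row: row[3])
    return step, val_loss


def generalization_gap(rows):
    """Validation loss minus training loss at the last logged step."""
    train_loss, _, _, val_loss = rows[-1]
    return val_loss - train_loss


best_step, best_val = best_checkpoint(old_run)
print(best_step, best_val)
print(round(generalization_gap(old_run), 4))
```

A widening gap of this kind is commonly read as overfitting, though the diff itself does not say why the run was shortened.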
 ### Framework versions

 This model is a fine-tuned version of [bigcode/starcoderbase-1b](https://huggingface.co/bigcode/starcoderbase-1b) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.0382
 ## Model description
 - total_train_batch_size: 16
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 10
+- training_steps: 500
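To illustrate what `lr_scheduler_type: cosine` with 10 warmup steps and 500 training steps implies, here is a sketch of the learning-rate multiplier. It assumes the usual linear-warmup then half-cosine decay shape (the one used by `get_cosine_schedule_with_warmup` in transformers with its default half cycle). The base learning rate does not appear in this hunk, so only the multiplier is computed, and the function name is illustrative.

```python
import math


def cosine_with_warmup(step, warmup_steps=10, total_steps=500):
    """Multiplier applied to the base learning rate at a given optimizer step.

    Assumed shape: linear ramp from 0 to 1 over the warmup steps,
    then a half-cosine decay from 1 to 0 over the remaining steps.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Multiplier at a few points of the 500-step run.
for step in (0, 5, 10, 255, 500):
    print(step, round(cosine_with_warmup(step), 4))
```

With these settings the warmup covers 2% of the run (10 of 500 steps), compared with 1% in the previous configuration (50 of 5000 steps).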
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 1.276         | 0.0416 | 10   | 1.0832          |
+| 1.2151        | 0.0833 | 20   | 1.0789          |
+| 1.2213        | 0.1249 | 30   | 1.0736          |
+| 1.1694        | 0.1666 | 40   | 1.0718          |
+| 1.2627        | 0.2082 | 50   | 1.0696          |
+| 1.1801        | 0.2499 | 60   | 1.0690          |
+| 1.1013        | 0.2915 | 70   | 1.0637          |
+| 1.082         | 0.3332 | 80   | 1.0643          |
+| 1.0783        | 0.3748 | 90   | 1.0638          |
+| 1.1166        | 0.4164 | 100  | 1.0615          |
+| 1.1207        | 0.4581 | 110  | 1.0581          |
+| 1.197         | 0.4997 | 120  | 1.0562          |
+| 1.012         | 0.5414 | 130  | 1.0583          |
+| 1.1291        | 0.5830 | 140  | 1.0515          |
+| 1.0695        | 0.6247 | 150  | 1.0520          |
+| 1.0924        | 0.6663 | 160  | 1.0514          |
+| 1.1287        | 0.7080 | 170  | 1.0536          |
+| 1.0514        | 0.7496 | 180  | 1.0508          |
+| 1.1101        | 0.7913 | 190  | 1.0491          |
+| 1.1474        | 0.8329 | 200  | 1.0489          |
+| 1.1451        | 0.8745 | 210  | 1.0476          |
+| 1.1688        | 0.9162 | 220  | 1.0434          |
+| 1.053         | 0.9578 | 230  | 1.0447          |
+| 1.0146        | 0.9995 | 240  | 1.0438          |
+| 1.1127        | 1.0411 | 250  | 1.0442          |
+| 0.9734        | 1.0828 | 260  | 1.0420          |
+| 1.0315        | 1.1244 | 270  | 1.0445          |
+| 1.0803        | 1.1661 | 280  | 1.0435          |
+| 1.0892        | 1.2077 | 290  | 1.0440          |
+| 1.0191        | 1.2493 | 300  | 1.0427          |
+| 1.034         | 1.2910 | 310  | 1.0416          |
+| 1.1136        | 1.3326 | 320  | 1.0413          |
+| 0.9837        | 1.3743 | 330  | 1.0413          |
+| 1.0659        | 1.4159 | 340  | 1.0405          |
+| 0.9931        | 1.4576 | 350  | 1.0409          |
+| 1.1141        | 1.4992 | 360  | 1.0403          |
+| 1.0851        | 1.5409 | 370  | 1.0399          |
+| 1.053         | 1.5825 | 380  | 1.0390          |
+| 1.0652        | 1.6242 | 390  | 1.0395          |
+| 1.0998        | 1.6658 | 400  | 1.0396          |
+| 0.9909        | 1.7074 | 410  | 1.0390          |
+| 1.0946        | 1.7491 | 420  | 1.0386          |
+| 1.0471        | 1.7907 | 430  | 1.0382          |
+| 0.9719        | 1.8324 | 440  | 1.0382          |
+| 1.0641        | 1.8740 | 450  | 1.0382          |
+| 1.0003        | 1.9157 | 460  | 1.0383          |
+| 1.0128        | 1.9573 | 470  | 1.0383          |
+| 1.0637        | 1.9990 | 480  | 1.0384          |
+| 1.0583        | 2.0406 | 490  | 1.0383          |
+| 0.991         | 2.0822 | 500  | 1.0382          |
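The Epoch and Step columns allow a rough back-of-the-envelope check of how many optimizer steps make up one epoch in each run. The sketch below assumes the epoch value is simply the step count divided by steps per epoch, and that the validation loss is a mean per-token cross-entropy in nats (so its exponential can be read as perplexity). Both are assumptions, not statements from the model card.

```python
import math

TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list in this diff


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one (step, epoch) table row."""
    return step / epoch


# Last row of the removed table and last row of the added table.
old_spe = steps_per_epoch(5000, 9.9602)
new_spe = steps_per_epoch(500, 2.0822)

# Approximate number of training examples seen per epoch.
old_examples = old_spe * TOTAL_TRAIN_BATCH_SIZE
new_examples = new_spe * TOTAL_TRAIN_BATCH_SIZE

# Perplexity implied by the final validation losses (assumption: loss in nats).
old_ppl = math.exp(1.2169)
new_ppl = math.exp(1.0382)

print(round(old_spe), round(new_spe))
print(round(old_examples), round(new_examples))
print(round(old_ppl, 3), round(new_ppl, 3))
```

Under these assumptions the two runs differ in steps per epoch by roughly a factor of two, so their loss values may not be measured on identical data and should be compared with care.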
 ### Framework versions

adapter_config.json CHANGED

@@ -21,9 +21,9 @@
   "revision": null,
   "target_modules": [
     "c_fc",
-    "c_attn",
     "c_proj",
-    "q_attn"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

   "revision": null,
   "target_modules": [
     "c_fc",
     "c_proj",
+    "q_attn",
+    "c_attn"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
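The hunk above touches only the order of the `target_modules` entries. A small sketch that compares the two lists as sets makes this explicit. It assumes the entries are matched by name and that their order carries no meaning, which the diff itself does not state.

```python
# target_modules before and after this commit, copied from the hunk above.
old_targets = ["c_fc", "c_attn", "c_proj", "q_attn"]
new_targets = ["c_fc", "c_proj", "q_attn", "c_attn"]


def module_diff(old, new):
    """Return (added, removed) module names, ignoring list order."""
    old_set, new_set = set(old), set(new)
    return sorted(new_set - old_set), sorted(old_set - new_set)


added, removed = module_diff(old_targets, new_targets)
print("added:", added)
print("removed:", removed)
```

Both lists come back empty, so the same four modules are targeted before and after the commit.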

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:91cba88a0da21472c5eb0385074a1e267356d4b39c8b0b03006a2c877b68bc6d
 size 22241240

 version https://git-lfs.github.com/spec/v1
+oid sha256:837770b5d26de865ba3ad82a19807fb5536705078b70b52236b3f25ddffe6e83
 size 22241240
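The binary files in this commit are stored as Git LFS pointers, each holding a `version`, an `oid` (SHA-256 digest) and a `size` in bytes. A minimal sketch of parsing such a pointer and checking a downloaded blob against it follows. The function names are illustrative and not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """True if the bytes have the size and SHA-256 digest recorded in the pointer."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# New pointer for adapter_model.safetensors, copied from the hunk above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:837770b5d26de865ba3ad82a19807fb5536705078b70b52236b3f25ddffe6e83\n"
    "size 22241240\n"
)
print(pointer["algo"], pointer["size"])
```

Because the size is unchanged (22241240 bytes) and only the digest differs, the adapter keeps the same tensor layout with new weight values, which is consistent with retraining under the same adapter configuration.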

runs/Nov26_02-14-09_localhost/events.out.tfevents.1732567452.localhost CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:52407103f205483fdf1f5e7e1d011cc11536eadd9e21e7003f7e91ce1df36b9f
-size 8298

 version https://git-lfs.github.com/spec/v1
+oid sha256:b7e19b69f7b1d0e9d5f628040bdfcbacf8be72339e7a5411e1ea1a257c19a76c
+size 8780

runs/Nov27_12-37-02_localhost/events.out.tfevents.1732691225.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a933309d9839a4d8a976185923886612011774faa5b451af48e1a1c127ebeaed
+size 5406

runs/Nov27_12-42-53_localhost/events.out.tfevents.1732691574.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fb3602fe6213a02f944510ae7d4f96547fbcc3ccc3e6a2657136f79713c2be19
+size 5406

runs/Nov27_12-50-42_localhost/events.out.tfevents.1732692043.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a0624d3afbf59f3ddd5c7f617e346d1aeb7bf578b42161860ba0ab0d59a4fc0f
+size 5406

runs/Nov27_12-52-19_localhost/events.out.tfevents.1732692140.localhost ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9eca7b86e36ab5a16731db78118f036a4370b818f94d303b4da451a3eb7a897e
+size 29747
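The event-file paths in this commit carry a run directory name, a numeric field and a host name. The sketch below splits a path into those parts, assuming the numeric field is a Unix timestamp in seconds and that everything after it is the host name. The function name is illustrative.

```python
from datetime import datetime, timezone


def parse_event_path(path):
    """Split a TensorBoard event-file path into run name, creation time and host.

    Assumed file name layout: events.out.tfevents.<unix seconds>.<hostname>
    """
    run_dir, _, filename = path.rpartition("/")
    parts = filename.split(".")
    timestamp = int(parts[3])
    return {
        "run": run_dir.split("/")[-1],
        "created_utc": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "hostname": ".".join(parts[4:]),
    }


info = parse_event_path(
    "runs/Nov27_12-52-19_localhost/events.out.tfevents.1732692140.localhost"
)
print(info["run"], info["created_utc"].isoformat(), info["hostname"])
```

The run directory name presumably encodes local wall-clock time, so it need not match the UTC value decoded from the file name.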

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c92d01671426afa1aff9f9e64c7c347272e643667e4414c0e506c43b648d8a78
 size 5304

 version https://git-lfs.github.com/spec/v1
+oid sha256:5119259b8317b8c0f8623eb6d82dfa894dd6c8015698585b12331350bf7b2207
 size 5304