anupk/askPauladapter

Files changed (4)

README.md +27 -5
adapter_config.json +4 -4
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.6059
 ## Model description
@@ -41,16 +41,38 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
-- num_epochs: 3
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 2.3669        | 1.0   | 326  | 2.7544          |
-| 1.2093        | 2.0   | 652  | 3.1944          |
-| 1.021         | 3.0   | 978  | 3.6059          |
 ### Framework versions

 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.3633
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
+- num_epochs: 25
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 3.0161        | 1.0   | 326  | 2.6890          |
+| 1.3622        | 2.0   | 652  | 2.9713          |
+| 1.0421        | 3.0   | 978  | 3.1679          |
+| 0.7533        | 4.0   | 1304 | 3.5433          |
+| 0.5052        | 5.0   | 1630 | 3.5190          |
+| 0.4463        | 6.0   | 1956 | 3.8213          |
+| 0.2102        | 7.0   | 2282 | 3.8646          |
+| 0.3904        | 8.0   | 2608 | 3.9794          |
+| 0.1932        | 9.0   | 2934 | 3.9933          |
+| 0.3121        | 10.0  | 3260 | 4.2430          |
+| 0.1707        | 11.0  | 3586 | 4.3414          |
+| 0.3446        | 12.0  | 3912 | 5.0113          |
+| 0.1671        | 13.0  | 4238 | 4.5196          |
+| 0.1743        | 14.0  | 4564 | 4.4975          |
+| 0.4551        | 15.0  | 4890 | 4.2461          |
+| 0.1981        | 16.0  | 5216 | 4.9300          |
+| 0.2151        | 17.0  | 5542 | 4.8182          |
+| 0.1077        | 18.0  | 5868 | 4.6348          |
+| 0.2005        | 19.0  | 6194 | 5.1244          |
+| 0.1163        | 20.0  | 6520 | 4.6448          |
+| 0.0731        | 21.0  | 6846 | 4.8622          |
+| 0.0849        | 22.0  | 7172 | 4.8057          |
+| 0.1515        | 23.0  | 7498 | 4.9841          |
+| 0.2565        | 24.0  | 7824 | 5.3223          |
+| 0.2449        | 25.0  | 8150 | 4.3633          |
 ### Framework versions
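
Reading the added results rows: training loss falls over the 25 epochs while validation loss is lowest at the first epoch. The sketch below is only an illustration of how one might check that from the table. The values are copied from the added rows above; the variable names are made up for this example and are not part of the repository.

```python
# (epoch, step, training_loss, validation_loss), copied from the added table rows.
rows = [
    (1, 326, 3.0161, 2.6890),
    (2, 652, 1.3622, 2.9713),
    (3, 978, 1.0421, 3.1679),
    (4, 1304, 0.7533, 3.5433),
    (5, 1630, 0.5052, 3.5190),
    (6, 1956, 0.4463, 3.8213),
    (7, 2282, 0.2102, 3.8646),
    (8, 2608, 0.3904, 3.9794),
    (9, 2934, 0.1932, 3.9933),
    (10, 3260, 0.3121, 4.2430),
    (11, 3586, 0.1707, 4.3414),
    (12, 3912, 0.3446, 5.0113),
    (13, 4238, 0.1671, 4.5196),
    (14, 4564, 0.1743, 4.4975),
    (15, 4890, 0.4551, 4.2461),
    (16, 5216, 0.1981, 4.9300),
    (17, 5542, 0.2151, 4.8182),
    (18, 5868, 0.1077, 4.6348),
    (19, 6194, 0.2005, 5.1244),
    (20, 6520, 0.1163, 4.6448),
    (21, 6846, 0.0731, 4.8622),
    (22, 7172, 0.0849, 4.8057),
    (23, 7498, 0.1515, 4.9841),
    (24, 7824, 0.2565, 5.3223),
    (25, 8150, 0.2449, 4.3633),
]

# Epoch with the lowest validation loss, and the last logged epoch.
best = min(rows, key=lambda r: r[3])
final = rows[-1]

# Optimizer steps per epoch, derived from the last row.
steps_per_epoch = final[1] // final[0]

print(f"best epoch: {best[0]} (validation loss {best[3]:.4f})")
print(f"final epoch: {final[0]} (validation loss {final[3]:.4f})")
print(f"steps per epoch: {steps_per_epoch}")
```

The final validation loss matches the `Loss: 4.3633` line added near the top of the README, which reports the last epoch rather than the best one.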

adapter_config.json CHANGED

@@ -22,13 +22,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "k_proj",
     "down_proj",
-    "v_proj",
     "q_proj",
     "up_proj",
-    "gate_proj",
-    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "down_proj",
+    "gate_proj",
     "q_proj",
     "up_proj",
+    "o_proj",
+    "k_proj",
+    "v_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
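
The `target_modules` hunk appears to reorder entries rather than add or remove any. A minimal sketch of that check, assuming the list is treated as an unordered collection of module names (an assumption about how the adapter config is consumed, not something this diff states):

```python
# Module names copied from the removed/kept lines (old) and kept/added lines (new).
old_modules = ["k_proj", "down_proj", "v_proj", "q_proj", "up_proj", "gate_proj", "o_proj"]
new_modules = ["down_proj", "gate_proj", "q_proj", "up_proj", "o_proj", "k_proj", "v_proj"]

# Compare as sets to ignore ordering, and as lists to detect the reordering.
same_set = set(old_modules) == set(new_modules)
same_order = old_modules == new_modules

print(f"same modules: {same_set}")
print(f"same order: {same_order}")
```

Under that assumption, the `+4 -4` change to this file would be a serialization-order difference only.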

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:54ec66c363c19bd603a65163df4bbbd3394f6ae94df706f96634a29b75d40e24
 size 3817075560

 version https://git-lfs.github.com/spec/v1
+oid sha256:b6a2ae0adb81d32a5cecca68c3ff3ae18b9e1f27df002f0c66efe690f339ccb7
 size 3817075560

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:586c07ad390925e9c210df1540ab7e08ce38366d1d37e39ca6e04866c4363b1e
 size 4283

 version https://git-lfs.github.com/spec/v1
+oid sha256:c666da4ebe994f7a208ee1a878a7481eaef585917b21ccec62afcd74678f4d85
 size 4283
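
Both binary files are stored as Git LFS pointers, so the diff shows only the `oid` changing while `size` stays the same. The sketch below parses the two `training_args.bin` pointers shown above to make that comparison explicit. `parse_lfs_pointer` is a hypothetical helper written for this example, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the old and new sides of the diff.
old = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:586c07ad390925e9c210df1540ab7e08ce38366d1d37e39ca6e04866c4363b1e\n"
    "size 4283\n"
)
new = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:c666da4ebe994f7a208ee1a878a7481eaef585917b21ccec62afcd74678f4d85\n"
    "size 4283\n"
)

content_changed = old["oid"] != new["oid"]
size_unchanged = old["size"] == new["size"]

print(f"content changed: {content_changed}")
print(f"size unchanged: {size_unchanged}")
```

An identical size with a different hash means the file contents changed without changing length, which is all the pointer can tell us; the actual differences are not visible in this diff.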