vxbrandon committed (verified)
Commit 47b6b82 · 1 Parent(s): 05fc947

Training in progress, step 600

README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
- - Loss: 2.6397
+ - Loss: 2.5787
 
 ## Model description
 
@@ -45,7 +45,7 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
- - training_steps: 400
+ - training_steps: 500
 
 ### Training results
 
@@ -67,6 +67,10 @@ The following hyperparameters were used during training:
 | 2.6097 | 0.06 | 350 | 2.9186 |
 | 2.7506 | 0.06 | 375 | 2.8954 |
 | 2.7809 | 0.06 | 400 | 2.8744 |
+ | 2.7346 | 0.07 | 425 | 2.8555 |
+ | 2.6997 | 0.07 | 450 | 2.8420 |
+ | 2.5839 | 0.08 | 475 | 2.8263 |
+ | 2.6435 | 0.08 | 500 | 2.8170 |
 
 
 ### Framework versions
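For reference, here is a minimal sketch of how the hyperparameters listed in this README hunk could map onto `transformers.TrainingArguments`. The training script itself is not part of this commit, so everything other than the values quoted above (Adam betas/epsilon, linear scheduler, the new 500-step budget, and the 25-step evaluation spacing visible in the results table) is an assumption, including the output directory and the per-device batch-size split.

```python
# Minimal sketch, not the repo's actual training script: maps the README
# hyperparameters onto transformers.TrainingArguments. Values not shown in
# the diff (output_dir, per-device batch sizes) are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-sft",        # placeholder path
    per_device_eval_batch_size=1,       # assumes total_eval_batch_size 4 comes from 4 processes
    adam_beta1=0.9,                     # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,                  # epsilon=1e-08
    lr_scheduler_type="linear",         # lr_scheduler_type: linear
    max_steps=500,                      # training_steps: 500 after this commit
    evaluation_strategy="steps",
    eval_steps=25,                      # matches the 25-step spacing of the results table
)
```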
adapter_config.json CHANGED
@@ -19,11 +19,11 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-     "gate_proj",
-     "q_proj",
      "down_proj",
      "v_proj",
-     "up_proj"
+     "up_proj",
+     "q_proj",
+     "gate_proj"
   ],
   "task_type": "CAUSAL_LM"
 }
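The `target_modules` change above is only a reordering: both sides list the same five Mistral projection layers (q_proj, v_proj, gate_proj, up_proj, down_proj), and PEFT matches module names against this list as an unordered collection, so the set of layers carrying LoRA adapters is unchanged. Below is a minimal `peft.LoraConfig` sketch with these targets; rank, alpha, and dropout are not visible in this hunk, so those values are placeholders rather than the repo's settings.

```python
# Minimal sketch, assuming the PEFT LoRA workflow: r, lora_alpha and
# lora_dropout are placeholders, since this hunk only shows target_modules
# and task_type.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                  # placeholder rank (not shown in the diff)
    lora_alpha=32,         # placeholder scaling (not shown in the diff)
    lora_dropout=0.05,     # placeholder dropout (not shown in the diff)
    target_modules=["down_proj", "v_proj", "up_proj", "q_proj", "gate_proj"],
    task_type="CAUSAL_LM",
)
```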
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:e6d327386af8dd3a80022ce2fa3f20fd42369186e4eec0a0767d77155d89a6f8
+ oid sha256:c01d3a5c152de579100ba46c2276f6b35a069eede3c338454e212fb90626349b
 size 281061608
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:7fd35ac5321ad038e7cff8e1c55c7d10ee844150e12c542cfc5ccd511bad66e9
+ oid sha256:709b919bc3b82cb1d6e3eae09cc078ceaecff3a9525bec7a9e1be7d05dfa2aa1
 size 4943162336
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:5a4793f545a56ed0d434c1cc693cb1d5847b8566a67742fba90c7de0e9560c45
+ oid sha256:e379001b24f6016fb41fd5b892a6d072a98033abffe4e0e18e22a9768185b4a8
 size 4999819336
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:2f4f6aad47d4bb287ba752d651ee2bf663099aaa50ac1e1f6a51d5216216529e
+ oid sha256:285b007f4e21cd8c15a5619d9dfc9e1858c8bda0fb2c3811fdcdc3415a87b28e
 size 4540516344
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:c67366e38212c49d83e0cdbd19cc4e17424b8c8574dc991c3ac865c346926513
+ oid sha256:71cf25bba1ccfe5f8363bf6f4862764514ffd098ebbdccd099864b89a8e6ed0b
 size 6520
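Each of the pointer-file diffs above follows the Git LFS pointer spec: only the sha256 `oid` changes while the byte `size` stays the same, which is what you would expect when the same tensors are overwritten by a later checkpoint. A small illustrative sketch for checking a downloaded file against its pointer is below; the file name and values are copied from the `training_args.bin` hunk, while the helper itself is an assumption, not part of the repo.

```python
# Illustrative helper (not part of this repo): verify a downloaded file
# against the oid/size recorded in its Git LFS pointer.
import hashlib

def verify_lfs_object(path: str, expected_oid: str, expected_size: int) -> bool:
    """Return True if the file's sha256 digest and byte count match the pointer."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest() == expected_oid and size == expected_size

# Values taken from the new training_args.bin pointer in this commit.
print(verify_lfs_object(
    "training_args.bin",
    expected_oid="71cf25bba1ccfe5f8363bf6f4862764514ffd098ebbdccd099864b89a8e6ed0b",
    expected_size=6520,
))
```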