End of training

Files changed:
- README.md (+1, -55)
- adapter_config.json (+4, -4)
- adapter_model.safetensors (+1, -1)
- training_args.bin (+1, -1)
README.md
CHANGED

@@ -17,8 +17,6 @@ should probably proofread and complete it, then remove this comment. -->
 # gemma_FT
 
 This model is a fine-tuned version of [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 2.2294
 
 ## Model description
 
@@ -46,63 +44,11 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- training_steps:
+- training_steps: 12
 - mixed_precision_training: Native AMP
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 3.0284 | 0.0 | 1 | 3.5564 |
-| 3.3693 | 0.0 | 2 | 3.4912 |
-| 3.2853 | 0.0 | 3 | 3.3325 |
-| 3.0263 | 0.0 | 4 | 3.1522 |
-| 3.2309 | 0.0 | 5 | 2.9678 |
-| 3.0045 | 0.0 | 6 | 2.8090 |
-| 2.3177 | 0.0 | 7 | 2.7007 |
-| 2.7699 | 0.0 | 8 | 2.6393 |
-| 2.593 | 0.0 | 9 | 2.6119 |
-| 2.6395 | 0.01 | 10 | 2.5837 |
-| 2.2016 | 0.01 | 11 | 2.5492 |
-| 2.4271 | 0.01 | 12 | 2.5134 |
-| 2.6307 | 0.01 | 13 | 2.4833 |
-| 2.5509 | 0.01 | 14 | 2.4618 |
-| 2.4261 | 0.01 | 15 | 2.4458 |
-| 2.468 | 0.01 | 16 | 2.4316 |
-| 2.2545 | 0.01 | 17 | 2.4174 |
-| 2.4923 | 0.01 | 18 | 2.4027 |
-| 2.8034 | 0.01 | 19 | 2.3862 |
-| 2.4174 | 0.01 | 20 | 2.3703 |
-| 2.0761 | 0.01 | 21 | 2.3553 |
-| 2.2617 | 0.01 | 22 | 2.3421 |
-| 2.582 | 0.01 | 23 | 2.3314 |
-| 1.9628 | 0.01 | 24 | 2.3239 |
-| 2.4096 | 0.01 | 25 | 2.3175 |
-| 2.1909 | 0.01 | 26 | 2.3110 |
-| 2.3109 | 0.01 | 27 | 2.3045 |
-| 2.4591 | 0.01 | 28 | 2.2984 |
-| 2.2765 | 0.02 | 29 | 2.2930 |
-| 2.0513 | 0.02 | 30 | 2.2875 |
-| 2.0974 | 0.02 | 31 | 2.2813 |
-| 2.5818 | 0.02 | 32 | 2.2755 |
-| 2.3872 | 0.02 | 33 | 2.2701 |
-| 2.1044 | 0.02 | 34 | 2.2654 |
-| 2.322 | 0.02 | 35 | 2.2611 |
-| 2.1259 | 0.02 | 36 | 2.2573 |
-| 2.3434 | 0.02 | 37 | 2.2538 |
-| 2.2724 | 0.02 | 38 | 2.2506 |
-| 2.3195 | 0.02 | 39 | 2.2476 |
-| 2.458 | 0.02 | 40 | 2.2448 |
-| 2.2682 | 0.02 | 41 | 2.2423 |
-| 2.0698 | 0.02 | 42 | 2.2397 |
-| 1.803 | 0.02 | 43 | 2.2374 |
-| 2.1955 | 0.02 | 44 | 2.2354 |
-| 1.9671 | 0.02 | 45 | 2.2336 |
-| 1.6402 | 0.02 | 46 | 2.2322 |
-| 1.9659 | 0.02 | 47 | 2.2312 |
-| 2.4895 | 0.03 | 48 | 2.2303 |
-| 2.2568 | 0.03 | 49 | 2.2297 |
-| 1.9725 | 0.03 | 50 | 2.2294 |
 
 
 ### Framework versions
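For context, the hyperparameters in the hunk above correspond to a standard `transformers` `Trainer` run. Below is a minimal, hypothetical reconstruction of such a run: only `max_steps`, the warmup, the linear scheduler, the Adam betas/epsilon and native AMP are taken from the card itself; the toy dataset and all other values are illustrative placeholders, since the commit does not include the training script.

```python
# Hedged sketch of the run configured above; not the author's actual script.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "google/gemma-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Toy stand-in for the unnamed ("None") dataset: causal-LM examples where
# the labels are simply a copy of the input ids.
texts = ["A short example document.", "Another short example document."]
train_ds = []
for t in texts:
    ids = tokenizer(t)["input_ids"]
    train_ds.append({"input_ids": ids, "attention_mask": [1] * len(ids), "labels": ids})

args = TrainingArguments(
    output_dir="gemma_FT",
    max_steps=12,                # "- training_steps: 12" after this commit
    warmup_steps=2,              # "- lr_scheduler_warmup_steps: 2"
    lr_scheduler_type="linear",  # "- lr_scheduler_type: linear"
    adam_beta1=0.9,              # "Adam with betas=(0.9,0.999) and epsilon=1e-08"
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    fp16=True,                   # "- mixed_precision_training: Native AMP"
    per_device_train_batch_size=1,  # assumed; not stated in the diffed hunk
    logging_steps=1,
)

Trainer(model=model, args=args, train_dataset=train_ds).train()
```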
adapter_config.json
CHANGED

@@ -19,13 +19,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "k_proj",
-    "up_proj",
-    "down_proj",
     "o_proj",
     "q_proj",
+    "k_proj",
     "gate_proj",
-    "v_proj"
+    "up_proj",
+    "v_proj",
+    "down_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
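This hunk only reorders `target_modules`; before and after, the adapter targets the same seven attention and MLP projections of the Gemma architecture. As a hedged sketch, a `peft.LoraConfig` producing this module list might look as follows; `r`, `lora_alpha` and `lora_dropout` are assumptions, since those fields sit outside the diffed hunk.

```python
# Minimal sketch of a PEFT LoRA config matching the "target_modules" above.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    target_modules=[
        "o_proj", "q_proj", "k_proj", "gate_proj",
        "up_proj", "v_proj", "down_proj",
    ],
    r=8,                    # assumed rank; not shown in the diff
    lora_alpha=16,          # assumed scaling; not shown in the diff
    lora_dropout=0.05,      # assumed dropout; not shown in the diff
    task_type="CAUSAL_LM",  # from the config: "task_type": "CAUSAL_LM"
    use_rslora=False,       # from the config: "use_rslora": false
)

base = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```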
adapter_model.safetensors
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:5e8edcfe2a8de2682dacf0a9f10557569be7affe05ac359928a869b5a3063cf3
 size 29450584
training_args.bin
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:63d52faaacf58905f69df60603490dddb079e07179df233f9e34b467a3b80126
 size 4856
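Both binary files are stored through Git LFS, so these hunks only touch their three-line pointer files (spec version, SHA-256 object id, byte size); the new `oid` lines record the retrained adapter weights and training arguments. A hedged sketch of fetching the real payload and checking it against the pointer's oid with `huggingface_hub` (the `repo_id` is hypothetical, since the commit page does not show the repository's namespace):

```python
# Sketch: download the LFS-backed file and verify its sha256 against the
# oid recorded in the pointer shown above.
import hashlib
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="someuser/gemma_FT",  # hypothetical; namespace not shown here
    filename="adapter_model.safetensors",
)

digest = hashlib.sha256(open(path, "rb").read()).hexdigest()
# Expected to match the pointer's oid:
# 5e8edcfe2a8de2682dacf0a9f10557569be7affe05ac359928a869b5a3063cf3
print(digest)
```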