Basha738 committed
Commit 49c8653 · verified · 1 Parent(s): 65d41de

llama2-13B-supervised-ft-5-epochs-351

README.md CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.1290
+- Loss: 2.0428
 
 ## Model description
 
@@ -40,25 +40,38 @@ The following hyperparameters were used during training:
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 8
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 1
+- num_epochs: 5
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 2.3285        | 0.12  | 4    | 2.1680          |
-| 2.2299        | 0.24  | 8    | 2.1595          |
-| 2.0283        | 0.35  | 12   | 2.1514          |
-| 2.3197        | 0.47  | 16   | 2.1442          |
-| 2.1502        | 0.59  | 20   | 2.1382          |
-| 2.1846        | 0.71  | 24   | 2.1336          |
-| 2.2649        | 0.82  | 28   | 2.1306          |
-| 2.252         | 0.94  | 32   | 2.1290          |
+| 2.2017        | 0.24  | 4    | 2.1673          |
+| 2.3534        | 0.47  | 8    | 2.1569          |
+| 2.1913        | 0.71  | 12   | 2.1461          |
+| 2.1365        | 0.94  | 16   | 2.1355          |
+| 2.2305        | 1.18  | 20   | 2.1253          |
+| 2.2103        | 1.41  | 24   | 2.1157          |
+| 2.077         | 1.65  | 28   | 2.1067          |
+| 2.0825        | 1.88  | 32   | 2.0981          |
+| 2.0967        | 2.12  | 36   | 2.0901          |
+| 2.1279        | 2.35  | 40   | 2.0826          |
+| 1.9441        | 2.59  | 44   | 2.0758          |
+| 2.1963        | 2.82  | 48   | 2.0697          |
+| 2.0088        | 3.06  | 52   | 2.0641          |
+| 2.1253        | 3.29  | 56   | 2.0592          |
+| 2.1337        | 3.53  | 60   | 2.0550          |
+| 1.9641        | 3.76  | 64   | 2.0513          |
+| 1.9613        | 4.0   | 68   | 2.0484          |
+| 2.0841        | 4.24  | 72   | 2.0462          |
+| 2.2068        | 4.47  | 76   | 2.0444          |
+| 2.0153        | 4.71  | 80   | 2.0433          |
+| 2.1073        | 4.94  | 84   | 2.0428          |
 
 
 ### Framework versions
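The second hunk above doubles the effective batch size (1 sample per device × 16 gradient-accumulation steps = 16) and extends training from 1 to 5 epochs, which is what moves the evaluation loss from 2.1290 down to 2.0428. As a reading aid, here is a minimal sketch of the equivalent `TrainingArguments`, assuming the run used the Hugging Face `transformers` `Trainer` (the `output_dir` value is a placeholder; the learning rate is omitted because it falls outside this diff):

```python
# Minimal sketch, not the actual training script: maps the updated
# hyperparameters from the README diff onto transformers.TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama2-13B-supervised-ft-5-epochs-351",  # placeholder path
    per_device_train_batch_size=1,    # train_batch_size: 1
    per_device_eval_batch_size=1,     # eval_batch_size: 1
    gradient_accumulation_steps=16,   # 1 * 16 = total_train_batch_size 16
    num_train_epochs=5,               # num_epochs: 5
    lr_scheduler_type="linear",
    adam_beta1=0.9,                   # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    fp16=True,                        # mixed_precision_training: Native AMP
)
```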
adapter_config.json CHANGED
@@ -19,13 +19,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "down_proj",
     "o_proj",
-    "up_proj",
+    "q_proj",
     "v_proj",
+    "up_proj",
     "gate_proj",
-    "k_proj",
-    "q_proj",
-    "down_proj"
+    "k_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5ca6b86293e72e448256ba721c6cb6d98e35653723bb8c580906d1823fe13d7e
+oid sha256:579f27d35002759749365c8a4776e0d8daaf7e257d4d0cd6309278e831f5f9c1
 size 1001465824
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:665d0f5d76ff011650e9b363c0eeb92bde61fffd61ab1cd09a959010d6ff6457
+oid sha256:b0c336f5e655ce62ff8336be780b0a793f61909a8508df70ce6e21589c4cfc67
 size 4664