Model save

Files changed (4)

README.md +32 -12
config.json +1 -0
model.safetensors +1 -1
training_args.bin +1 -1

README.md

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.5861
+- Loss: 4.4840
 ## Model description
@@ -44,17 +44,37 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch  | Step  | Validation Loss |
-|:-------------:|:------:|:-----:|:---------------:|
-| 5.4168        | 0.1067 | 5000  | 5.3265          |
-| 5.0714        | 0.2133 | 10000 | 4.9847          |
-| 4.9362        | 0.32   | 15000 | 4.8556          |
-| 4.8725        | 0.4267 | 20000 | 4.7838          |
-| 4.8284        | 0.5333 | 25000 | 4.7355          |
-| 4.7933        | 0.64   | 30000 | 4.7016          |
-| 4.7692        | 0.7467 | 35000 | 4.6735          |
-| 4.7015        | 0.8533 | 40000 | 4.6264          |
-| 4.6686        | 0.96   | 45000 | 4.5861          |
+| Training Loss | Epoch  | Step   | Validation Loss |
+|:-------------:|:------:|:------:|:---------------:|
+| 5.4612        | 0.0337 | 5000   | 5.3719          |
+| 5.0925        | 0.0674 | 10000  | 5.0228          |
+| 4.9633        | 0.1011 | 15000  | 4.8946          |
+| 4.909         | 0.1347 | 20000  | 4.8268          |
+| 4.844         | 0.1684 | 25000  | 4.7804          |
+| 4.8204        | 0.2021 | 30000  | 4.7456          |
+| 4.7826        | 0.2358 | 35000  | 4.7157          |
+| 4.7616        | 0.2695 | 40000  | 4.6921          |
+| 4.7328        | 0.3032 | 45000  | 4.6735          |
+| 4.7271        | 0.3368 | 50000  | 4.6575          |
+| 4.7147        | 0.3705 | 55000  | 4.6423          |
+| 4.7072        | 0.4042 | 60000  | 4.6325          |
+| 4.6978        | 0.4379 | 65000  | 4.6206          |
+| 4.6824        | 0.4716 | 70000  | 4.6131          |
+| 4.6754        | 0.5053 | 75000  | 4.6040          |
+| 4.6769        | 0.5389 | 80000  | 4.5978          |
+| 4.6631        | 0.5726 | 85000  | 4.5908          |
+| 4.6596        | 0.6063 | 90000  | 4.5845          |
+| 4.654         | 0.6400 | 95000  | 4.5789          |
+| 4.6503        | 0.6737 | 100000 | 4.5746          |
+| 4.6454        | 0.7074 | 105000 | 4.5697          |
+| 4.6497        | 0.7411 | 110000 | 4.5653          |
+| 4.6363        | 0.7747 | 115000 | 4.5563          |
+| 4.6209        | 0.8084 | 120000 | 4.5399          |
+| 4.6091        | 0.8421 | 125000 | 4.5266          |
+| 4.5895        | 0.8758 | 130000 | 4.5117          |
+| 4.5762        | 0.9095 | 135000 | 4.5010          |
+| 4.5778        | 0.9432 | 140000 | 4.4914          |
+| 4.5552        | 0.9768 | 145000 | 4.4840          |
 ### Framework versions
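
To put the two evaluation losses on a more interpretable scale, here is a minimal sketch that converts them to perplexity and estimates steps per epoch from the logged rows. It assumes the reported loss is a mean token-level cross-entropy in nats, which the model card does not state. The helper names `perplexity` and `steps_per_epoch` are illustrative, and the Epoch column is rounded, so the step estimates are approximate.

```python
import math

# (step, epoch, validation loss) copied from the first and last rows of each table.
OLD_RUN = [(5000, 0.1067, 5.3265), (45000, 0.96, 4.5861)]
NEW_RUN = [(5000, 0.0337, 5.3719), (145000, 0.9768, 4.4840)]


def perplexity(loss_nats: float) -> float:
    """Perplexity, ASSUMING the loss is mean cross-entropy in nats per token."""
    return math.exp(loss_nats)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


for name, run in (("old", OLD_RUN), ("new", NEW_RUN)):
    step, epoch, loss = run[-1]  # last logged row of the run
    print(
        f"{name}: final loss {loss:.4f}, "
        f"perplexity ~{perplexity(loss):.1f}, "
        f"~{steps_per_epoch(step, epoch):,.0f} steps/epoch"
    )
```

Under that assumption, the new run logs roughly three times as many steps per epoch as the old one. The diff alone does not say whether that comes from a smaller batch size, a larger dataset, or something else.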

config.json

@@ -46,6 +46,7 @@
   "num_key_value_heads": 2,
   "pad_token_id": 200001,
   "partial_rotary_factor": 0.25,
+  "pope_bias_init": "zero",
   "rms_norm_eps": 1e-06,
   "rope_scaling": null,
   "rope_theta": 10000.0,

model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ce9532e8734bac68f9273c3ac539cb2c8605260cbfb4565caa7096731a2bdc05
+oid sha256:29d07219f27aabaf04c4c72023e04cf74a80693ffd561d5bd1cfb05436ff6c0b
 size 303441424

training_args.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bd09e2e3e3f4f6875c5950bc90dadcf0bca2ff91234b453d4bc60bc1e58d526f
+oid sha256:0f21152283bc740ca0041584baa54c17fe71b6098c364c0af1a57ed16de351d4
 size 6033
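
The two binary files appear in the diff only as Git LFS pointers, whose `oid` is the SHA-256 digest of the file contents and whose `size` is the byte length. A minimal sketch for checking a downloaded file against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration, not part of any library, and the local path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path

# New-side pointer for model.safetensors, copied from the diff above.
NEW_MODEL_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:29d07219f27aabaf04c4c72023e04cf74a80693ffd561d5bd1cfb05436ff6c0b
size 303441424
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer_text: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and SHA-256 named in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    # Cheap check first: a size mismatch rules the file out without hashing it.
    if path.stat().st_size != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in chunks so large weight files are never loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


# Usage (assumes the weights were downloaded to ./model.safetensors):
# print(matches_pointer(Path("model.safetensors"), NEW_MODEL_POINTER))
```

Both pointers keep the same `size` across the commit while the `oid` changes, so the size check alone cannot tell the old and new files apart; the hash comparison is what distinguishes them.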