Model save

Files changed (4)

README.md +36 -55
adapter_model.safetensors +1 -1
all_results.json +6 -6
train_results.json +6 -6

README.md CHANGED

@@ -5,18 +5,18 @@ base_model: gpt2
 tags:
 - generated_from_trainer
 model-index:
-- name: Se124M100KInfPrompt_endtoken
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# Se124M100KInfPrompt_endtoken
 This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6957
 ## Model description
@@ -46,58 +46,39 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step  | Validation Loss |
-|:-------------:|:-----:|:-----:|:---------------:|
-| 0.2129        | 1.0   | 1430  | 0.8043          |
-| 0.2018        | 2.0   | 2860  | 0.7705          |
-| 0.1949        | 3.0   | 4290  | 0.7588          |
-| 0.1913        | 4.0   | 5720  | 0.7498          |
-| 0.1921        | 5.0   | 7150  | 0.7435          |
-| 0.1903        | 6.0   | 8580  | 0.7371          |
-| 0.1888        | 7.0   | 10010 | 0.7339          |
-| 0.1881        | 8.0   | 11440 | 0.7299          |
-| 0.1872        | 9.0   | 12870 | 0.7267          |
-| 0.187         | 10.0  | 14300 | 0.7251          |
-| 0.184         | 11.0  | 15730 | 0.7229          |
-| 0.1846        | 12.0  | 17160 | 0.7212          |
-| 0.1851        | 13.0  | 18590 | 0.7182          |
-| 0.1804        | 14.0  | 20020 | 0.7153          |
-| 0.1848        | 15.0  | 21450 | 0.7141          |
-| 0.1824        | 16.0  | 22880 | 0.7144          |
-| 0.1796        | 17.0  | 24310 | 0.7116          |
-| 0.18          | 18.0  | 25740 | 0.7108          |
-| 0.1825        | 19.0  | 27170 | 0.7082          |
-| 0.1852        | 20.0  | 28600 | 0.7082          |
-| 0.1785        | 21.0  | 30030 | 0.7072          |
-| 0.1811        | 22.0  | 31460 | 0.7057          |
-| 0.178         | 23.0  | 32890 | 0.7059          |
-| 0.1827        | 24.0  | 34320 | 0.7046          |
-| 0.1813        | 25.0  | 35750 | 0.7033          |
-| 0.1825        | 26.0  | 37180 | 0.7039          |
-| 0.1795        | 27.0  | 38610 | 0.7032          |
-| 0.1801        | 28.0  | 40040 | 0.7017          |
-| 0.1781        | 29.0  | 41470 | 0.7013          |
-| 0.1823        | 30.0  | 42900 | 0.7010          |
-| 0.1781        | 31.0  | 44330 | 0.7012          |
-| 0.1809        | 32.0  | 45760 | 0.6999          |
-| 0.1764        | 33.0  | 47190 | 0.6996          |
-| 0.1791        | 34.0  | 48620 | 0.6983          |
-| 0.1793        | 35.0  | 50050 | 0.6988          |
-| 0.1785        | 36.0  | 51480 | 0.6980          |
-| 0.1777        | 37.0  | 52910 | 0.6980          |
-| 0.1774        | 38.0  | 54340 | 0.6980          |
-| 0.1795        | 39.0  | 55770 | 0.6976          |
-| 0.1772        | 40.0  | 57200 | 0.6974          |
-| 0.1793        | 41.0  | 58630 | 0.6974          |
-| 0.1777        | 42.0  | 60060 | 0.6968          |
-| 0.1777        | 43.0  | 61490 | 0.6965          |
-| 0.1779        | 44.0  | 62920 | 0.6965          |
-| 0.1782        | 45.0  | 64350 | 0.6964          |
-| 0.1765        | 46.0  | 65780 | 0.6961          |
-| 0.1758        | 47.0  | 67210 | 0.6962          |
-| 0.1763        | 48.0  | 68640 | 0.6960          |
-| 0.1788        | 49.0  | 70070 | 0.6958          |
-| 0.1776        | 50.0  | 71500 | 0.6957          |
 ### Framework versions

 tags:
 - generated_from_trainer
 model-index:
+- name: Se124M500KInfPrompt_endtoken
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# Se124M500KInfPrompt_endtoken
 This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6719
 ## Model description
 ### Training results
+| Training Loss | Epoch | Step   | Validation Loss |
+|:-------------:|:-----:|:------:|:---------------:|
+| 0.1898        | 1.0   | 5427   | 0.7433          |
+| 0.1857        | 2.0   | 10854  | 0.7238          |
+| 0.1843        | 3.0   | 16281  | 0.7118          |
+| 0.1813        | 4.0   | 21708  | 0.7045          |
+| 0.1802        | 5.0   | 27135  | 0.6990          |
+| 0.1785        | 6.0   | 32562  | 0.6944          |
+| 0.1769        | 7.0   | 37989  | 0.6918          |
+| 0.1743        | 8.0   | 43416  | 0.6875          |
+| 0.1752        | 9.0   | 48843  | 0.6854          |
+| 0.1756        | 10.0  | 54270  | 0.6854          |
+| 0.1736        | 11.0  | 59697  | 0.6837          |
+| 0.1756        | 12.0  | 65124  | 0.6812          |
+| 0.173         | 13.0  | 70551  | 0.6798          |
+| 0.1737        | 14.0  | 75978  | 0.6791          |
+| 0.1741        | 15.0  | 81405  | 0.6783          |
+| 0.177         | 16.0  | 86832  | 0.6771          |
+| 0.1734        | 17.0  | 92259  | 0.6765          |
+| 0.1719        | 18.0  | 97686  | 0.6760          |
+| 0.1737        | 19.0  | 103113 | 0.6763          |
+| 0.1716        | 20.0  | 108540 | 0.6747          |
+| 0.1713        | 21.0  | 113967 | 0.6741          |
+| 0.1739        | 22.0  | 119394 | 0.6738          |
+| 0.1694        | 23.0  | 124821 | 0.6737          |
+| 0.1703        | 24.0  | 130248 | 0.6743          |
+| 0.1697        | 25.0  | 135675 | 0.6730          |
+| 0.172         | 26.0  | 141102 | 0.6731          |
+| 0.1711        | 27.0  | 146529 | 0.6720          |
+| 0.1726        | 28.0  | 151956 | 0.6720          |
+| 0.1703        | 29.0  | 157383 | 0.6716          |
+| 0.1732        | 30.0  | 162810 | 0.6716          |
+| 0.171         | 31.0  | 168237 | 0.6719          |
 ### Framework versions

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9181e691728283da052e440eb03c026ec4bd673683dbd9545d920d8b44e50a3a
 size 309974336

 version https://git-lfs.github.com/spec/v1
+oid sha256:963ff76e5201583684dcef52121403c7aa5c0888a7400872b43677f42ec1461d
 size 309974336
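
The pointer above records only the SHA-256 digest (`oid`) and byte size of the adapter weights, so a downloaded copy can be checked against it. A minimal sketch, assuming the file has been fetched locally; the helper names `lfs_oid` and `matches_pointer` are hypothetical and not part of any library:

```python
import hashlib
from pathlib import Path


def lfs_oid(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, expected_oid, expected_size):
    """Check a local file against the oid and size fields of a Git LFS pointer."""
    # Compare the cheap size field first, then the full content hash.
    return Path(path).stat().st_size == expected_size and lfs_oid(path) == expected_oid


if __name__ == "__main__":
    # Values copied from the new pointer in this commit; the local path is an assumption.
    local_file = Path("adapter_model.safetensors")
    if local_file.exists():
        print(matches_pointer(
            local_file,
            "963ff76e5201583684dcef52121403c7aa5c0888a7400872b43677f42ec1461d",
            309974336,
        ))
```

Both revisions report the same size (309974336 bytes), so only the digest distinguishes the old weights from the new ones.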

all_results.json CHANGED

@@ -1,13 +1,13 @@
 {
-    "epoch": 50.0,
     "eval_loss": 0.6956692934036255,
     "eval_runtime": 59.8322,
     "eval_samples_per_second": 163.257,
     "eval_steps_per_second": 5.114,
     "perplexity": 2.005050592091695,
-    "total_flos": 1.49878932701184e+17,
-    "train_loss": 0.18437957987751993,
-    "train_runtime": 7319.2549,
-    "train_samples_per_second": 312.395,
-    "train_steps_per_second": 9.769
 }

 {
+    "epoch": 31.0,
     "eval_loss": 0.6956692934036255,
     "eval_runtime": 59.8322,
     "eval_samples_per_second": 163.257,
     "eval_steps_per_second": 5.114,
     "perplexity": 2.005050592091695,
+    "total_flos": 3.528546650263388e+17,
+    "train_loss": 0.17647521918996942,
+    "train_runtime": 17128.7716,
+    "train_samples_per_second": 506.884,
+    "train_steps_per_second": 15.842
 }
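
The `perplexity` field is consistent with the exponential of `eval_loss`. The sketch below reproduces that relationship from the values shown; the function name `perplexity_from_loss` is hypothetical. Note that the `eval_*` and `perplexity` fields are unchanged context lines in this diff, so they still reflect the earlier loss of 0.6957 rather than the 0.6719 reported in the new README.

```python
import math


def perplexity_from_loss(eval_loss):
    """Perplexity as exp of the mean cross-entropy loss (natural log base)."""
    return math.exp(eval_loss)


if __name__ == "__main__":
    # Values copied from all_results.json as shown in the diff.
    eval_loss = 0.6956692934036255
    reported_perplexity = 2.005050592091695

    computed = perplexity_from_loss(eval_loss)
    print(computed)
    print(math.isclose(computed, reported_perplexity, rel_tol=1e-6))

    # Perplexity implied by the loss in the updated README (rounded there to 4 places).
    print(perplexity_from_loss(0.6719))
```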

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
-    "epoch": 50.0,
-    "total_flos": 1.49878932701184e+17,
-    "train_loss": 0.18437957987751993,
-    "train_runtime": 7319.2549,
-    "train_samples_per_second": 312.395,
-    "train_steps_per_second": 9.769
 }

 {
+    "epoch": 31.0,
+    "total_flos": 3.528546650263388e+17,
+    "train_loss": 0.17647521918996942,
+    "train_runtime": 17128.7716,
+    "train_samples_per_second": 506.884,
+    "train_steps_per_second": 15.842
 }
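
A quick arithmetic check ties these numbers to the README tables. Steps per epoch (1430 and 5427) are read from the first row of each table. For the old run, completed steps divided by `train_runtime` reproduces the reported `train_steps_per_second`. For the new run it does not: 31 epochs of 5427 steps over 17128.7716 s gives roughly 9.8 steps/s, not 15.842. One possible explanation, which is an assumption and not confirmed by this diff, is that the run was scheduled for 50 epochs and stopped at epoch 31, with throughput computed from the scheduled step count. The dictionary layout and function names below are hypothetical.

```python
# Values copied from the README tables and train_results.json in this diff.
OLD_RUN = {"epochs": 50, "steps_per_epoch": 1430, "train_runtime": 7319.2549}
NEW_RUN = {"epochs": 31, "steps_per_epoch": 5427, "train_runtime": 17128.7716}

# Assumption: the new run was scheduled for the same 50 epochs as the old one.
ASSUMED_SCHEDULED_EPOCHS = 50


def total_steps(run, epochs=None):
    """Optimizer steps for the given number of epochs (defaults to completed epochs)."""
    n_epochs = run["epochs"] if epochs is None else epochs
    return n_epochs * run["steps_per_epoch"]


def steps_per_second(run, epochs=None):
    """Throughput as steps divided by wall-clock training time in seconds."""
    return total_steps(run, epochs) / run["train_runtime"]


if __name__ == "__main__":
    print("old: steps =", total_steps(OLD_RUN),
          "steps/s =", round(steps_per_second(OLD_RUN), 3))
    print("new: steps =", total_steps(NEW_RUN),
          "steps/s (completed) =", round(steps_per_second(NEW_RUN), 3))
    print("new: steps/s (assumed 50-epoch schedule) =",
          round(steps_per_second(NEW_RUN, ASSUMED_SCHEDULED_EPOCHS), 3))
```

The completed step counts match the last `Step` entries in the two tables (71500 and 168237), so the tables and the `epoch` fields agree with each other even though the new throughput figure needs the extra assumption.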