ninagroot/GPT2-705Mtest

Files changed (5)

README.md +19 -19
config.json +1 -1
model.safetensors +2 -2
runs/Mar22_13-21-38_gcn31.local.snellius.surf.nl/events.out.tfevents.1711110106.gcn31.local.snellius.surf.nl.1007907.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 6.3448
 ## Model description
@@ -32,34 +32,34 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2.5e-05
-- train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 16
-- total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 300
-- num_epochs: 15
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 0.74  | 2    | 8.7412          |
-| No log        | 1.86  | 5    | 8.4430          |
-| No log        | 2.98  | 8    | 7.9292          |
-| No log        | 3.72  | 10   | 7.6564          |
-| No log        | 4.84  | 13   | 7.4596          |
-| No log        | 5.95  | 16   | 7.1794          |
-| No log        | 6.7   | 18   | 7.0752          |
-| 7.7805        | 7.81  | 21   | 6.8572          |
-| 7.7805        | 8.93  | 24   | 6.6855          |
-| 7.7805        | 9.67  | 26   | 6.5781          |
-| 7.7805        | 10.79 | 29   | 6.4065          |
-| 7.7805        | 11.16 | 30   | 6.3448          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.3316
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.00025
+- train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 12
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 5.8964        | 1.0   | 69   | 5.8372          |
+| 5.2016        | 2.0   | 138  | 5.0017          |
+| 4.4098        | 3.0   | 207  | 4.6658          |
+| 4.2459        | 4.0   | 276  | 4.5260          |
+| 3.9837        | 5.0   | 345  | 4.4107          |
+| 3.8526        | 6.0   | 414  | 4.3741          |
+| 3.5545        | 7.0   | 483  | 4.3328          |
+| 3.392         | 8.0   | 552  | 4.3175          |
+| 3.3396        | 9.0   | 621  | 4.3236          |
+| 3.0426        | 10.0  | 690  | 4.3322          |
+| 3.028         | 11.0  | 759  | 4.3254          |
+| 3.0344        | 12.0  | 828  | 4.3316          |
 ### Framework versions

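The numbers in the new model card can be cross-checked with a few lines of arithmetic. The sketch below recomputes `total_train_batch_size` for both runs, finds the epoch with the lowest validation loss in the new results table, and converts the final loss to a perplexity. It assumes a single training device (the card does not state the device count) and assumes the reported loss is a mean cross-entropy in nats; the helper name `effective_batch_size` is hypothetical.

```python
import math

# Hyperparameters copied from the old (-) and new (+) sides of the README diff.
old_run = {"train_batch_size": 8, "gradient_accumulation_steps": 16}
new_run = {"train_batch_size": 2, "gradient_accumulation_steps": 2}


def effective_batch_size(run, num_devices=1):
    """Examples consumed per optimizer step.

    num_devices=1 is an assumption; the model card does not say how many
    devices were used.
    """
    return run["train_batch_size"] * run["gradient_accumulation_steps"] * num_devices


# (epoch, training loss, validation loss) rows from the new results table.
new_results = [
    (1, 5.8964, 5.8372),
    (2, 5.2016, 5.0017),
    (3, 4.4098, 4.6658),
    (4, 4.2459, 4.5260),
    (5, 3.9837, 4.4107),
    (6, 3.8526, 4.3741),
    (7, 3.5545, 4.3328),
    (8, 3.3920, 4.3175),
    (9, 3.3396, 4.3236),
    (10, 3.0426, 4.3322),
    (11, 3.0280, 4.3254),
    (12, 3.0344, 4.3316),
]

# Epoch with the lowest validation loss.
best_epoch, _, best_val = min(new_results, key=lambda row: row[2])

# Gap between validation and training loss at the last epoch.
final_epoch, final_train, final_val = new_results[-1]
final_gap = final_val - final_train

print(effective_batch_size(old_run))  # 128
print(effective_batch_size(new_run))  # 4
print(best_epoch, best_val)           # 8 4.3175
print(f"final val - train gap: {final_gap:.4f}")
# Perplexity is exp(loss) only if the loss is a mean cross-entropy in nats.
print(f"final perplexity (assumed): {math.exp(final_val):.1f}")
```

In the table, validation loss bottoms out at epoch 8 and stays roughly flat afterwards while training loss keeps falling, so the reported final loss (4.3316) is slightly worse than the best one in the run.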
config.json CHANGED

@@ -14,7 +14,7 @@
   "n_head": 16,
   "n_inner": null,
   "n_layer": 24,
-  "n_positions": 210,
   "pad_token_id": 0,
   "reorder_and_upcast_attn": false,
   "resid_pdrop": 0.0,

   "n_head": 16,
   "n_inner": null,
   "n_layer": 24,
+  "n_positions": 256,
   "pad_token_id": 0,
   "reorder_and_upcast_attn": false,
   "resid_pdrop": 0.0,

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f97b3f20c97d0b50fdeab6a7b95d0c7b113c5a7c0f25298052f05581ecc54a96
-size 2747651872

 version https://git-lfs.github.com/spec/v1
+oid sha256:c9d94a61c5829c224b1caa5dabb4e5338d6cf4d1582545cdc03adb3494c1e8e5
+size 2747934496

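The growth in `model.safetensors` lines up with the `n_positions` change in `config.json`. The sketch below divides the size difference by the number of added positions to see what embedding width it would imply. This is an inference, not something the diff states: it assumes float32 weights, assumes the position embedding table is the only tensor whose shape changed, and assumes the safetensors header length is the same in both files.

```python
# File sizes in bytes, from the old (-) and new (+) Git LFS pointers.
old_size = 2747651872
new_size = 2747934496

# n_positions before and after, from the config.json diff.
old_n_positions = 210
new_n_positions = 256

BYTES_PER_PARAM = 4  # assumption: weights are stored as float32

delta_bytes = new_size - old_size
extra_positions = new_n_positions - old_n_positions

# If only the (n_positions x n_embd) position table grew, then
# delta_bytes = extra_positions * n_embd * BYTES_PER_PARAM.
implied_n_embd, remainder = divmod(delta_bytes, extra_positions * BYTES_PER_PARAM)

print(delta_bytes)                # 282624
print(extra_positions)            # 46
print(implied_n_embd, remainder)  # 1536 0
```

The division comes out exact, which is consistent with a hidden size of 1536 under the stated assumptions. The `n_embd` field itself is outside the visible hunk, so this remains a consistency check rather than a confirmed value.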
runs/Mar22_13-21-38_gcn31.local.snellius.surf.nl/events.out.tfevents.1711110106.gcn31.local.snellius.surf.nl.1007907.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f6ffe578f5a7aecc0844dcf6d3f272f5a799ffbb1a2faef0a732136bb999d7f3
+size 14456

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7a605625002f9a8b862cd5bf5287017ed7500d7286426e4a0e094ae6cdd05bad
 size 4728

 version https://git-lfs.github.com/spec/v1
+oid sha256:d61d38bc7bcccbe54e492311a8ae78f35feaf0a6d0b83a59c1f85120eb1c9361
 size 4728
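The binary files in this commit appear only as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. A minimal sketch of reading such a pointer and checking content against it is shown below, using the two `training_args.bin` pointers as input. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the text of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes `data` have the size and SHA-256 the pointer records."""
    if pointer["hash_algo"] != "sha256":
        raise ValueError("only sha256 object IDs are handled in this sketch")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


old_ptr = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:7a605625002f9a8b862cd5bf5287017ed7500d7286426e4a0e094ae6cdd05bad
size 4728
""")

new_ptr = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:d61d38bc7bcccbe54e492311a8ae78f35feaf0a6d0b83a59c1f85120eb1c9361
size 4728
""")

# Same byte length, different hash: the contents changed but the size did not.
print(old_ptr["size"] == new_ptr["size"])      # True
print(old_ptr["digest"] == new_ptr["digest"])  # False
```

To check a downloaded file, read it in binary mode and pass the bytes to `matches_pointer`; for multi-gigabyte files such as `model.safetensors`, hashing in chunks would be the more practical variant.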