ninagroot committed on
Commit 1d16c57 · verified · 1 Parent(s): e071375

ninagroot/GPT2-705Mtest

README.md CHANGED
@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 4.4882
+ - Loss: 5.3517

  ## Model description

@@ -41,16 +41,19 @@ The following hyperparameters were used during training:
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 100
- - num_epochs: 3
+ - num_epochs: 6
  - mixed_precision_training: Native AMP

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 8.8145 | 1.0 | 38 | 6.3116 |
- | 5.5325 | 2.0 | 76 | 4.9721 |
- | 4.2961 | 3.0 | 114 | 4.4882 |
+ | No log | 1.0 | 7 | 7.7792 |
+ | No log | 2.0 | 14 | 7.0145 |
+ | 7.2066 | 3.0 | 21 | 6.5189 |
+ | 7.2066 | 4.0 | 28 | 5.9249 |
+ | 7.2066 | 5.0 | 35 | 5.5873 |
+ | 4.7445 | 6.0 | 42 | 5.3517 |


  ### Framework versions
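
The README hunk keeps `lr_scheduler_warmup_steps: 100` while the new results table ends at step 42, so the new run may finish before warmup completes. The following is a minimal sketch of the multiplier that a linear-warmup cosine schedule would apply to the base learning rate. It assumes the usual form (linear ramp to 1.0, then a half-cosine decay to 0); `lr_multiplier` is a hypothetical helper for illustration, not code from this repository, and the base learning rate itself is not visible in this hunk.

```python
import math


def lr_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Factor applied to the base learning rate at a given optimizer step.

    Assumed schedule shape: linear warmup followed by half-cosine decay.
    """
    if step < warmup_steps:
        # Linear warmup from 0 toward 1.
        return step / max(1, warmup_steps)
    # Half-cosine decay from 1 to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Step counts taken from the diff: 100 warmup steps in both runs,
# 114 total steps before the change and 42 after it.
old_peak = lr_multiplier(100, warmup_steps=100, total_steps=114)
new_final = lr_multiplier(42, warmup_steps=100, total_steps=42)

print(f"old run, step 100: {old_peak:.2f}")
print(f"new run, step 42:  {new_final:.2f}")
```

Under these assumptions the old run reaches the full base learning rate at step 100, whereas the new run stops at 42% of it. That may be worth keeping in mind when comparing the two evaluation losses.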
config.json CHANGED
@@ -28,5 +28,5 @@
  "torch_dtype": "float32",
  "transformers_version": "4.39.1",
  "use_cache": true,
- "vocab_size": 32000
+ "vocab_size": 12198
  }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d2148fe925c8c433ed4576b5c5aaa9145d4d20210558617c0250a5d994289d04
- size 2918049568
+ oid sha256:ff94515d6c5b430a6342f160004710a51ec5ab1f6e330c58ed5c5e51c7fbb8e1
+ size 2796386080
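
The drop in `model.safetensors` size can be cross-checked against the `vocab_size` change in `config.json`. The sketch below is a rough consistency check, not a statement about the actual architecture. It assumes the weights are stored as float32 (per `torch_dtype`), that only one vocabulary-sized matrix changed (for example, tied input and output embeddings), and that the file header did not change size. The embedding width is not shown anywhere in this diff; it is only inferred here.

```python
# Values copied from the diff above.
old_size, new_size = 2_918_049_568, 2_796_386_080  # model.safetensors bytes
old_vocab, new_vocab = 32_000, 12_198              # config.json vocab_size
BYTES_PER_PARAM = 4                                # float32, per torch_dtype

removed_bytes = old_size - new_size
removed_rows = old_vocab - new_vocab

# Assumption: a single (vocab_size x width) matrix accounts for the whole
# difference, so the removed parameters divide evenly by the removed rows.
removed_params, leftover_bytes = divmod(removed_bytes, BYTES_PER_PARAM)
inferred_width, leftover_params = divmod(removed_params, removed_rows)

print("bytes removed: ", removed_bytes)
print("params removed:", removed_params)
print("inferred width:", inferred_width)
print("remainders:    ", leftover_bytes, leftover_params)
```

Both divisions come out exact, giving an inferred width of 1536 per vocabulary row. This is consistent with the size change being explained by the smaller vocabulary alone, though the diff does not confirm it.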
runs/Apr16_10-21-59_gcn17.local.snellius.surf.nl/events.out.tfevents.1713255728.gcn17.local.snellius.surf.nl.2113632.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f4bbe50e37a31e3f398284660d3f46c6f54e150b3f1d7a1b49cb651ce9a58384
+ size 7037
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5fcc2d434f0c148de9388407bcd525d9541ad1bfa66684893352ad5474b5b383
+ oid sha256:643a33eb85763cc56594f442ab841ece74dc15ef92a57ec40aa57f7553a67c23
  size 4984
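
The binary files in this commit appear only as Git LFS pointers (`version`, `oid`, `size`), so the diff shows that `training_args.bin` changed content while keeping the same byte length. The following is a minimal sketch of how a downloaded file could be checked against such a pointer. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the payload in the demo is made-up data, not the real file.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that the byte length and the hash of `data` agree with the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer text for the new training_args.bin, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:643a33eb85763cc56594f442ab841ece74dc15ef92a57ec40aa57f7553a67c23\n"
    "size 4984\n"
)
print(pointer["algo"], pointer["size"])

# Demo with made-up bytes: build a pointer for them, then verify it.
payload = b"example payload, not the real file"
demo_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(matches_pointer(payload, demo_pointer))
```

For a multi-gigabyte file such as `model.safetensors`, it would be more practical to feed the hash in chunks rather than hold all of the bytes in memory as this sketch does.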