jhebmds committed on
Commit 8ab0df9 · verified · 1 Parent(s): 0009ab1

End of training

Files changed (2)
  1. README.md +16 -28
  2. model.safetensors +1 -1
README.md CHANGED
@@ -1,4 +1,5 @@
1
  ---
 
2
  base_model: gpt2
3
  tags:
4
  - generated_from_trainer
@@ -14,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
14
 
15
  This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
16
  It achieves the following results on the evaluation set:
17
- - Loss: 0.8839
18
 
19
  ## Model description
20
 
@@ -40,7 +41,7 @@ The following hyperparameters were used during training:
40
  - gradient_accumulation_steps: 4
41
  - total_train_batch_size: 16
42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
43
- - lr_scheduler_type: linear
44
  - num_epochs: 20
45
  - mixed_precision_training: Native AMP
46
 
@@ -48,32 +49,19 @@ The following hyperparameters were used during training:
48
 
49
  | Training Loss | Epoch | Step | Validation Loss |
50
  |:-------------:|:-------:|:----:|:---------------:|
51
- | 2.8834 | 0.7605 | 50 | 1.3156 |
52
- | 1.2105 | 1.5209 | 100 | 1.1001 |
53
- | 1.0759 | 2.2814 | 150 | 0.9879 |
54
- | 1.0037 | 3.0418 | 200 | 0.9584 |
55
- | 0.9627 | 3.8023 | 250 | 0.9284 |
56
- | 0.933 | 4.5627 | 300 | 0.9147 |
57
- | 0.9292 | 5.3232 | 350 | 0.9126 |
58
- | 0.9207 | 6.0837 | 400 | 0.9128 |
59
- | 0.91 | 6.8441 | 450 | 0.8995 |
60
- | 0.897 | 7.6046 | 500 | 0.9023 |
61
- | 0.8979 | 8.3650 | 550 | 0.8906 |
62
- | 0.8795 | 9.1255 | 600 | 0.8899 |
63
- | 0.8788 | 9.8859 | 650 | 0.8881 |
64
- | 0.8793 | 10.6464 | 700 | 0.8880 |
65
- | 0.8647 | 11.4068 | 750 | 0.8824 |
66
- | 0.8593 | 12.1673 | 800 | 0.8874 |
67
- | 0.8551 | 12.9278 | 850 | 0.8839 |
68
- | 0.8516 | 13.6882 | 900 | 0.8820 |
69
- | 0.8409 | 14.4487 | 950 | 0.8850 |
70
- | 0.8374 | 15.2091 | 1000 | 0.8786 |
71
- | 0.834 | 15.9696 | 1050 | 0.8836 |
72
- | 0.8222 | 16.7300 | 1100 | 0.8813 |
73
- | 0.819 | 17.4905 | 1150 | 0.8844 |
74
- | 0.8142 | 18.2510 | 1200 | 0.8836 |
75
- | 0.8081 | 19.0114 | 1250 | 0.8818 |
76
- | 0.8014 | 19.7719 | 1300 | 0.8839 |
77
 
78
 
79
  ### Framework versions
 
1
  ---
2
+ license: mit
3
  base_model: gpt2
4
  tags:
5
  - generated_from_trainer
 
15
 
16
  This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
17
  It achieves the following results on the evaluation set:
18
+ - Loss: 0.7807
19
 
20
  ## Model description
21
 
 
41
  - gradient_accumulation_steps: 4
42
  - total_train_batch_size: 16
43
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
44
+ - lr_scheduler_type: cosine
45
  - num_epochs: 20
46
  - mixed_precision_training: Native AMP
47
 
 
49
 
50
  | Training Loss | Epoch | Step | Validation Loss |
51
  |:-------------:|:-------:|:----:|:---------------:|
52
+ | 1.9956 | 1.5209 | 100 | 1.0035 |
53
+ | 0.9713 | 3.0418 | 200 | 0.8998 |
54
+ | 0.8675 | 4.5627 | 300 | 0.8239 |
55
+ | 0.8379 | 6.0837 | 400 | 0.8214 |
56
+ | 0.8213 | 7.6046 | 500 | 0.8166 |
57
+ | 0.8093 | 9.1255 | 600 | 0.8053 |
58
+ | 0.7987 | 10.6464 | 700 | 0.8004 |
59
+ | 0.7808 | 12.1673 | 800 | 0.7963 |
60
+ | 0.7714 | 13.6882 | 900 | 0.7826 |
61
+ | 0.7563 | 15.2091 | 1000 | 0.7837 |
62
+ | 0.7496 | 16.7300 | 1100 | 0.7801 |
63
+ | 0.7416 | 18.2510 | 1200 | 0.7805 |
64
+ | 0.7368 | 19.7719 | 1300 | 0.7807 |
 
65
 
66
 
67
  ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5af32cf37a8d440ea1364437067bac5c9d8b2b34626d5d3b37410a70480b3f05
3
  size 497774208
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9a025e88a28e3d9ff140c16cdf631d684ce1ebf7f8f58cdab1ff4b52027e9674
3
  size 497774208
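
The README diff above changes `lr_scheduler_type` from `linear` to `cosine`. As a rough illustration of how the two decay shapes differ, here is a minimal sketch of the learning-rate multipliers. It assumes no warmup and a single half-cosine cycle, neither of which is stated in the diff. `TOTAL_STEPS`, `linear_multiplier`, and `cosine_multiplier` are hypothetical names chosen for this example, and 1300 is simply the last step logged in the results table, so the real total may be slightly higher.

```python
import math

# Hypothetical total number of optimizer steps: 1300 is the last step logged
# in the results table; the actual total for 20 epochs may be slightly higher.
TOTAL_STEPS = 1300


def linear_multiplier(step, total_steps=TOTAL_STEPS):
    """Multiplier on the base learning rate, decaying linearly from 1 to 0.

    Assumes no warmup phase.
    """
    return max(0.0, (total_steps - step) / total_steps)


def cosine_multiplier(step, total_steps=TOTAL_STEPS):
    """Multiplier on the base learning rate, following half a cosine from 1 to 0.

    Assumes no warmup phase and a single half cycle.
    """
    progress = min(step / total_steps, 1.0)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print both multipliers at a few evenly spaced points in training.
    for step in range(0, TOTAL_STEPS + 1, 325):
        print(
            f"step {step:4d}  "
            f"linear={linear_multiplier(step):.3f}  "
            f"cosine={cosine_multiplier(step):.3f}"
        )
```

Under these assumptions the cosine multiplier stays above the linear one for the first half of training and falls below it in the second half. This only sketches the schedule shapes and does not by itself explain the change in evaluation loss between the two commits.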