# wikitext-103-raw-v1-sent-permute-9_default_gpt2
This model is a fine-tuned version of gpt2 on the `default` configuration of the tyzhu/wikitext-103-raw-v1-sent-permute-9 dataset. It achieves the following results on the evaluation set:
- Loss: 2.8110
- Accuracy: 0.4528
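Assuming the reported loss is the usual token-level cross-entropy in nats (the Transformers default for causal language modeling), the corresponding perplexity can be derived directly as a quick sanity check (not a separately reported metric):

```python
import math

eval_loss = 2.8110  # final evaluation loss reported above
perplexity = math.exp(eval_loss)  # perplexity = exp(cross-entropy in nats)
print(round(perplexity, 2))
```

This works out to a perplexity of roughly 16.6 on the evaluation set.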
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1.0
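The hyperparameter list above maps onto a `transformers.TrainingArguments` configuration along these lines (a sketch, not the exact training script used for this model; `output_dir` is a placeholder):

```python
from transformers import TrainingArguments

# Values taken from the hyperparameter list above; output_dir is a placeholder.
args = TrainingArguments(
    output_dir="gpt2-sent-permute-9",
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1.0,
)
```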
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 3.1679 | 0.05 | 886 | 2.9677 | 0.4343 |
| 3.1104 | 0.1 | 1772 | 2.9295 | 0.4389 |
| 3.0801 | 0.15 | 2658 | 2.9060 | 0.4415 |
| 3.0538 | 0.2 | 3544 | 2.8885 | 0.4437 |
| 3.038 | 0.25 | 4430 | 2.8755 | 0.4451 |
| 3.0205 | 0.3 | 5316 | 2.8642 | 0.4464 |
| 3.007 | 0.35 | 6202 | 2.8554 | 0.4475 |
| 2.9944 | 0.4 | 7088 | 2.8482 | 0.4486 |
| 2.9904 | 0.45 | 7974 | 2.8409 | 0.4495 |
| 2.984 | 0.5 | 8860 | 2.8355 | 0.4502 |
| 2.9783 | 0.55 | 9746 | 2.8307 | 0.4502 |
| 2.973 | 0.6 | 10632 | 2.8271 | 0.4508 |
| 2.9652 | 0.65 | 11518 | 2.8228 | 0.4516 |
| 2.9645 | 0.7 | 12404 | 2.8203 | 0.4521 |
| 2.9609 | 0.75 | 13290 | 2.8178 | 0.4524 |
| 2.9583 | 0.8 | 14176 | 2.8158 | 0.4524 |
| 2.9548 | 0.85 | 15062 | 2.8135 | 0.4525 |
| 2.9497 | 0.9 | 15948 | 2.8122 | 0.4529 |
| 2.9516 | 0.95 | 16834 | 2.8113 | 0.4527 |
### Framework versions
- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1