---
library_name: peft
license: mit
base_model: gpt2
tags:
- generated_from_trainer
model-index:
- name: Se124M10KInfPrompt
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Se124M10KInfPrompt

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.7128
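
The card does not include a usage example; below is a minimal inference sketch, assuming the adapter is loaded with PEFT on top of the gpt2 base model. The repo id `your-username/Se124M10KInfPrompt` is a placeholder and should be replaced with the actual adapter location (Hub id or local path).

```python
# Minimal inference sketch (assumption: this repo hosts a PEFT adapter for gpt2).
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the frozen gpt2 base model and its tokenizer.
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Attach the fine-tuned adapter weights; the repo id below is a placeholder.
model = PeftModel.from_pretrained(base_model, "your-username/Se124M10KInfPrompt")
model.eval()

inputs = tokenizer("Example prompt:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```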

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
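
For reference, a hedged sketch of how these hyperparameters map onto `transformers.TrainingArguments`. The dataset, base model, and PEFT/LoRA configuration are not documented on this card and are therefore omitted.

```python
# Sketch only: reproduces the listed hyperparameters; model/data/PEFT setup assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Se124M10KInfPrompt",
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",          # AdamW with betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,                    # Native AMP mixed-precision training
)
```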

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.4014        | 1.0   | 267   | 1.0141          |
| 0.2422        | 2.0   | 534   | 0.8523          |
| 0.2202        | 3.0   | 801   | 0.8168          |
| 0.2129        | 4.0   | 1068  | 0.7993          |
| 0.2059        | 5.0   | 1335  | 0.7837          |
| 0.2041        | 6.0   | 1602  | 0.7695          |
| 0.2031        | 7.0   | 1869  | 0.7635          |
| 0.1982        | 8.0   | 2136  | 0.7586          |
| 0.1975        | 9.0   | 2403  | 0.7532          |
| 0.1974        | 10.0  | 2670  | 0.7483          |
| 0.1978        | 11.0  | 2937  | 0.7467          |
| 0.1939        | 12.0  | 3204  | 0.7445          |
| 0.1953        | 13.0  | 3471  | 0.7439          |
| 0.1929        | 14.0  | 3738  | 0.7362          |
| 0.1937        | 15.0  | 4005  | 0.7328          |
| 0.1934        | 16.0  | 4272  | 0.7329          |
| 0.1927        | 17.0  | 4539  | 0.7323          |
| 0.1927        | 18.0  | 4806  | 0.7257          |
| 0.1909        | 19.0  | 5073  | 0.7276          |
| 0.1919        | 20.0  | 5340  | 0.7251          |
| 0.1919        | 21.0  | 5607  | 0.7239          |
| 0.1912        | 22.0  | 5874  | 0.7260          |
| 0.1897        | 23.0  | 6141  | 0.7241          |
| 0.1916        | 24.0  | 6408  | 0.7235          |
| 0.1905        | 25.0  | 6675  | 0.7225          |
| 0.1919        | 26.0  | 6942  | 0.7188          |
| 0.1883        | 27.0  | 7209  | 0.7207          |
| 0.1898        | 28.0  | 7476  | 0.7198          |
| 0.1874        | 29.0  | 7743  | 0.7195          |
| 0.188         | 30.0  | 8010  | 0.7194          |
| 0.1873        | 31.0  | 8277  | 0.7182          |
| 0.1878        | 32.0  | 8544  | 0.7212          |
| 0.1866        | 33.0  | 8811  | 0.7171          |
| 0.1883        | 34.0  | 9078  | 0.7151          |
| 0.1881        | 35.0  | 9345  | 0.7176          |
| 0.1868        | 36.0  | 9612  | 0.7149          |
| 0.1871        | 37.0  | 9879  | 0.7157          |
| 0.1876        | 38.0  | 10146 | 0.7162          |
| 0.188         | 39.0  | 10413 | 0.7142          |
| 0.1861        | 40.0  | 10680 | 0.7149          |
| 0.1862        | 41.0  | 10947 | 0.7144          |
| 0.1862        | 42.0  | 11214 | 0.7128          |
| 0.186         | 43.0  | 11481 | 0.7136          |
| 0.1868        | 44.0  | 11748 | 0.7137          |
| 0.1837        | 45.0  | 12015 | 0.7138          |
| 0.1868        | 46.0  | 12282 | 0.7141          |
| 0.187         | 47.0  | 12549 | 0.7133          |


### Framework versions

- PEFT 0.15.1
- Transformers 4.51.3
- PyTorch 2.6.0+cu118
- Datasets 3.5.0
- Tokenizers 0.21.1