---
license: mit
tags:
- generated_from_trainer
model-index:
- name: Romance-cleaned-1
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Romance-cleaned-1

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unspecified dataset (the Trainer did not record a dataset name).
It achieves the following results on the evaluation set:
- Loss: 4.7175

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 50
- mixed_precision_training: Native AMP
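
The relationship between these settings can be sketched in a few lines of plain Python. This is a minimal illustration, not the training script: it assumes a single device, takes the total of 1450 optimizer steps from the last row of the training results table, and assumes the cosine schedule is linear warmup followed by a half-cosine decay to zero. The function names `effective_batch_size` and `lr_at` are hypothetical helpers introduced for the example.

```python
import math

# Values taken from the hyperparameter list above.
BASE_LR = 5e-05
WARMUP_STEPS = 1000
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 8

# Assumptions (not stated in the list): one device, and 1450 total
# optimizer steps, the last step shown in the training results table.
NUM_DEVICES = 1
TOTAL_STEPS = 1450


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Number of sequences contributing to one optimizer step."""
    return per_device * accum_steps * num_devices


def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step.

    Assumed shape: linear warmup from 0 to base_lr, then a half-cosine
    decay from base_lr down to 0 at the final step.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 500, 1000, 1225, 1450):
        print(step, lr_at(step))
```

Under these assumptions, 32 × 8 gives the listed total_train_batch_size of 256, and the warmup covers 1000 of the 1450 steps, so the learning rate reaches its peak only late in the run.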

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 0.97  | 29   | 9.9497          |
| No log        | 1.97  | 58   | 9.1816          |
| No log        | 2.97  | 87   | 8.5947          |
| No log        | 3.97  | 116  | 8.2217          |
| No log        | 4.97  | 145  | 7.8354          |
| No log        | 5.97  | 174  | 7.5075          |
| No log        | 6.97  | 203  | 7.2112          |
| No log        | 7.97  | 232  | 6.9077          |
| No log        | 8.97  | 261  | 6.5994          |
| No log        | 9.97  | 290  | 6.3077          |
| No log        | 10.97 | 319  | 6.0416          |
| No log        | 11.97 | 348  | 5.8126          |
| No log        | 12.97 | 377  | 5.6197          |
| No log        | 13.97 | 406  | 5.4789          |
| No log        | 14.97 | 435  | 5.3665          |
| No log        | 15.97 | 464  | 5.2738          |
| No log        | 16.97 | 493  | 5.1942          |
| No log        | 17.97 | 522  | 5.1382          |
| No log        | 18.97 | 551  | 5.0784          |
| No log        | 19.97 | 580  | 5.0347          |
| No log        | 20.97 | 609  | 4.9873          |
| No log        | 21.97 | 638  | 4.9514          |
| No log        | 22.97 | 667  | 4.9112          |
| No log        | 23.97 | 696  | 4.8838          |
| No log        | 24.97 | 725  | 4.8468          |
| No log        | 25.97 | 754  | 4.8221          |
| No log        | 26.97 | 783  | 4.7996          |
| No log        | 27.97 | 812  | 4.7815          |
| No log        | 28.97 | 841  | 4.7606          |
| No log        | 29.97 | 870  | 4.7394          |
| No log        | 30.97 | 899  | 4.7167          |
| No log        | 31.97 | 928  | 4.7140          |
| No log        | 32.97 | 957  | 4.6910          |
| No log        | 33.97 | 986  | 4.6844          |
| No log        | 34.97 | 1015 | 4.6765          |
| No log        | 35.97 | 1044 | 4.6687          |
| No log        | 36.97 | 1073 | 4.6721          |
| No log        | 37.97 | 1102 | 4.6724          |
| No log        | 38.97 | 1131 | 4.6629          |
| No log        | 39.97 | 1160 | 4.6772          |
| No log        | 40.97 | 1189 | 4.6795          |
| No log        | 41.97 | 1218 | 4.6788          |
| No log        | 42.97 | 1247 | 4.6832          |
| No log        | 43.97 | 1276 | 4.6954          |
| No log        | 44.97 | 1305 | 4.7009          |
| No log        | 45.97 | 1334 | 4.7082          |
| No log        | 46.97 | 1363 | 4.7140          |
| No log        | 47.97 | 1392 | 4.7158          |
| No log        | 48.97 | 1421 | 4.7181          |
| No log        | 49.97 | 1450 | 4.7175          |
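
To make the table easier to read, the sketch below locates the lowest validation loss and converts losses to perplexity. It assumes the reported loss is a mean per-token cross-entropy in nats, so that perplexity is `exp(loss)`; the card does not state this explicitly. The names `VAL_LOSSES`, `STEPS_PER_EVAL`, and `perplexity` are introduced only for this example, and the values are copied from the table above.

```python
import math

# Validation losses copied from the table, one per evaluation (every 29 steps).
VAL_LOSSES = [
    9.9497, 9.1816, 8.5947, 8.2217, 7.8354, 7.5075, 7.2112, 6.9077, 6.5994, 6.3077,
    6.0416, 5.8126, 5.6197, 5.4789, 5.3665, 5.2738, 5.1942, 5.1382, 5.0784, 5.0347,
    4.9873, 4.9514, 4.9112, 4.8838, 4.8468, 4.8221, 4.7996, 4.7815, 4.7606, 4.7394,
    4.7167, 4.7140, 4.6910, 4.6844, 4.6765, 4.6687, 4.6721, 4.6724, 4.6629, 4.6772,
    4.6795, 4.6788, 4.6832, 4.6954, 4.7009, 4.7082, 4.7140, 4.7158, 4.7181, 4.7175,
]
STEPS_PER_EVAL = 29  # step column increases by 29 per row


def perplexity(loss):
    """Perplexity, assuming `loss` is mean cross-entropy in nats."""
    return math.exp(loss)


# Index of the evaluation with the lowest validation loss.
best_index = min(range(len(VAL_LOSSES)), key=VAL_LOSSES.__getitem__)
best_step = STEPS_PER_EVAL * (best_index + 1)
best_loss = VAL_LOSSES[best_index]
final_loss = VAL_LOSSES[-1]

if __name__ == "__main__":
    print(f"best:  step {best_step}, loss {best_loss}, ppl {perplexity(best_loss):.1f}")
    print(f"final: step {STEPS_PER_EVAL * len(VAL_LOSSES)}, loss {final_loss}, "
          f"ppl {perplexity(final_loss):.1f}")
```

The lowest validation loss in the table occurs at step 1131, after which the loss drifts slightly upward until the final value of 4.7175 at step 1450.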


### Framework versions

- Transformers 4.24.0
- Pytorch 1.12.1+cu113
- Datasets 2.6.1
- Tokenizers 0.13.1