---
license: mit
tags:
- generated_from_keras_callback
model-index:
- name: ghdi/punic-model
  results: []
---

<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->

# ghdi/punic-model

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
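After the final epoch (epoch 59, counting from 0), training reported the following losses:
- Train Loss: 3.9858
- Validation Loss: 7.6193
- Epoch: 59

Validation loss was lowest at epoch 34 (7.3327) and drifted upward afterwards while train loss kept falling, which suggests the final checkpoint is overfit to the training data.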

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-05, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': -984, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: mixed_float16
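
The `WarmUp` entry above describes a learning rate that ramps up toward `initial_learning_rate` over `warmup_steps`. Below is a minimal pure-Python sketch of that warmup phase only, assuming the usual formula `initial_learning_rate * (step / warmup_steps) ** power`. The function name `warmup_learning_rate` is hypothetical and not part of Transformers, and the post-warmup `PolynomialDecay` phase (recorded with `decay_steps: -984`) is deliberately not modeled.

```python
def warmup_learning_rate(step, initial_lr=5e-05, warmup_steps=1000, power=1.0):
    """Learning rate during warmup, using the values recorded in this card.

    Hypothetical helper for illustration; it only covers steps before
    `warmup_steps` and does not model the decay schedule that follows.
    """
    if not 0 <= step < warmup_steps:
        raise ValueError("this sketch only covers the warmup phase")
    # With power=1.0 this is a linear ramp from 0 toward initial_lr.
    return initial_lr * (step / warmup_steps) ** power


if __name__ == "__main__":
    for step in (0, 250, 500, 999):
        print(step, warmup_learning_rate(step))
```

With `power` set to 1.0 the ramp is linear, so halfway through warmup the rate is half of `initial_learning_rate`.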

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 10.9100    | 10.8188         | 0     |
| 10.7129    | 10.4690         | 1     |
| 10.3775    | 10.1048         | 2     |
| 10.0587    | 9.8271          | 3     |
| 9.8034     | 9.6395          | 4     |
| 9.6209     | 9.5085          | 5     |
| 9.5047     | 9.4043          | 6     |
| 9.3724     | 9.3072          | 7     |
| 9.2873     | 9.2090          | 8     |
| 9.1690     | 9.1091          | 9     |
| 8.9963     | 9.0013          | 10    |
| 8.8724     | 8.8875          | 11    |
| 8.7316     | 8.7701          | 12    |
| 8.6070     | 8.6477          | 13    |
| 8.4242     | 8.5243          | 14    |
| 8.2700     | 8.4018          | 15    |
| 8.1555     | 8.2834          | 16    |
| 7.9978     | 8.1696          | 17    |
| 7.8495     | 8.0607          | 18    |
| 7.6980     | 7.9635          | 19    |
| 7.5339     | 7.8726          | 20    |
| 7.4741     | 7.7917          | 21    |
| 7.3669     | 7.7233          | 22    |
| 7.2598     | 7.6604          | 23    |
| 7.1434     | 7.6088          | 24    |
| 7.0434     | 7.5579          | 25    |
| 6.9874     | 7.5171          | 26    |
| 6.8629     | 7.4881          | 27    |
| 6.8293     | 7.4694          | 28    |
| 6.6349     | 7.4367          | 29    |
| 6.7589     | 7.4071          | 30    |
| 6.5890     | 7.4003          | 31    |
| 6.5476     | 7.3576          | 32    |
| 6.4606     | 7.3400          | 33    |
| 6.3945     | 7.3327          | 34    |
| 6.2495     | 7.3435          | 35    |
| 6.0722     | 7.3375          | 36    |
| 6.1324     | 7.3365          | 37    |
| 6.0493     | 7.3458          | 38    |
| 5.9514     | 7.4002          | 39    |
| 5.8638     | 7.3356          | 40    |
| 5.7390     | 7.3488          | 41    |
| 5.6403     | 7.3687          | 42    |
| 5.5442     | 7.3831          | 43    |
| 5.4542     | 7.3888          | 44    |
| 5.3243     | 7.4340          | 45    |
| 5.2295     | 7.4170          | 46    |
| 5.1436     | 7.4110          | 47    |
| 5.0199     | 7.5223          | 48    |
| 4.9058     | 7.5142          | 49    |
| 4.8393     | 7.4926          | 50    |
| 4.7104     | 7.5253          | 51    |
| 4.6212     | 7.5420          | 52    |
| 4.5298     | 7.5799          | 53    |
| 4.4251     | 7.5940          | 54    |
| 4.3130     | 7.5752          | 55    |
| 4.2240     | 7.6315          | 56    |
| 4.1587     | 7.6412          | 57    |
| 4.0442     | 7.6748          | 58    |
| 3.9858     | 7.6193          | 59    |
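
To make the trend in the table easier to inspect, here is a small sketch that picks the epoch with the lowest validation loss and converts a loss to perplexity. The validation losses are copied from the table above; the perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats, which this card does not state. The names `best_epoch` and `perplexity` are illustrative.

```python
import math

# Validation loss per epoch (index = epoch), copied from the table above.
VALIDATION_LOSS = [
    10.8188, 10.4690, 10.1048, 9.8271, 9.6395, 9.5085, 9.4043, 9.3072, 9.2090, 9.1091,
    9.0013, 8.8875, 8.7701, 8.6477, 8.5243, 8.4018, 8.2834, 8.1696, 8.0607, 7.9635,
    7.8726, 7.7917, 7.7233, 7.6604, 7.6088, 7.5579, 7.5171, 7.4881, 7.4694, 7.4367,
    7.4071, 7.4003, 7.3576, 7.3400, 7.3327, 7.3435, 7.3375, 7.3365, 7.3458, 7.4002,
    7.3356, 7.3488, 7.3687, 7.3831, 7.3888, 7.4340, 7.4170, 7.4110, 7.5223, 7.5142,
    7.4926, 7.5253, 7.5420, 7.5799, 7.5940, 7.5752, 7.6315, 7.6412, 7.6748, 7.6193,
]


def best_epoch(losses):
    """Return (epoch, loss) for the lowest loss in the list."""
    epoch = min(range(len(losses)), key=lambda i: losses[i])
    return epoch, losses[epoch]


def perplexity(loss):
    """Perplexity, assuming `loss` is mean cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    epoch, loss = best_epoch(VALIDATION_LOSS)
    print(f"lowest validation loss {loss} at epoch {epoch}")
    print(f"perplexity at that epoch: {perplexity(loss):.1f}")
    print(f"perplexity at final epoch: {perplexity(VALIDATION_LOSS[-1]):.1f}")
```

A check like this is one way to decide which checkpoint to keep if per-epoch checkpoints are available; this card does not say whether they were saved.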


### Framework versions

- Transformers 4.28.1
- TensorFlow 2.12.0
- Datasets 2.11.0
- Tokenizers 0.13.3