---
license: mit
tags:
- generated_from_keras_callback
model-index:
- name: ghdi/punic-model
results: []
---
<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->
# ghdi/punic-model
This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
It achieves the following results at the final training epoch:
- Train Loss: 3.9858
- Validation Loss: 7.6193
- Epoch: 59
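The card does not yet document usage, so below is a minimal loading sketch. It assumes the repository ships TensorFlow weights (the framework versions below list TensorFlow 2.12.0); the prompt and sampling settings are illustrative only, not part of this card.
```python
# Minimal sketch: load ghdi/punic-model and generate a continuation.
# Assumes TF weights are present in the repo; prompt and sampling settings are illustrative.
from transformers import AutoTokenizer, TFAutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ghdi/punic-model")
model = TFAutoModelForCausalLM.from_pretrained("ghdi/punic-model")

inputs = tokenizer("Example prompt", return_tensors="tf")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```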
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: AdamWeightDecay
  - learning_rate: WarmUp schedule (initial_learning_rate: 5e-05, warmup_steps: 1000, power: 1.0) wrapping a PolynomialDecay schedule (initial_learning_rate: 5e-05, decay_steps: -984, end_learning_rate: 0.0, power: 1.0, cycle: False)
  - weight_decay_rate: 0.01
  - decay: 0.0
  - beta_1: 0.9
  - beta_2: 0.999
  - epsilon: 1e-08
  - amsgrad: False
- training_precision: mixed_float16
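The optimizer entry above matches the serialization produced by `transformers.create_optimizer` for TensorFlow. The sketch below reconstructs an equivalent setup under that assumption; it is not the original training script. Note that warmup_steps (1000) plus decay_steps (-984) implies about 16 total schedule steps, so the warmup phase would not have completed.
```python
# Sketch of an equivalent optimizer/schedule, reconstructed from the config above.
# num_train_steps = warmup_steps + decay_steps = 1000 + (-984) = 16 (inferred, not documented).
import tensorflow as tf
from transformers import create_optimizer

tf.keras.mixed_precision.set_global_policy("mixed_float16")  # matches training_precision above

optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,
    num_train_steps=16,
    num_warmup_steps=1000,
    weight_decay_rate=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
# model.compile(optimizer=optimizer)  # attach to the Keras model (e.g. TFGPT2LMHeadModel) before fit()
```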
### Training results
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 10.9100 | 10.8188 | 0 |
| 10.7129 | 10.4690 | 1 |
| 10.3775 | 10.1048 | 2 |
| 10.0587 | 9.8271 | 3 |
| 9.8034 | 9.6395 | 4 |
| 9.6209 | 9.5085 | 5 |
| 9.5047 | 9.4043 | 6 |
| 9.3724 | 9.3072 | 7 |
| 9.2873 | 9.2090 | 8 |
| 9.1690 | 9.1091 | 9 |
| 8.9963 | 9.0013 | 10 |
| 8.8724 | 8.8875 | 11 |
| 8.7316 | 8.7701 | 12 |
| 8.6070 | 8.6477 | 13 |
| 8.4242 | 8.5243 | 14 |
| 8.2700 | 8.4018 | 15 |
| 8.1555 | 8.2834 | 16 |
| 7.9978 | 8.1696 | 17 |
| 7.8495 | 8.0607 | 18 |
| 7.6980 | 7.9635 | 19 |
| 7.5339 | 7.8726 | 20 |
| 7.4741 | 7.7917 | 21 |
| 7.3669 | 7.7233 | 22 |
| 7.2598 | 7.6604 | 23 |
| 7.1434 | 7.6088 | 24 |
| 7.0434 | 7.5579 | 25 |
| 6.9874 | 7.5171 | 26 |
| 6.8629 | 7.4881 | 27 |
| 6.8293 | 7.4694 | 28 |
| 6.6349 | 7.4367 | 29 |
| 6.7589 | 7.4071 | 30 |
| 6.5890 | 7.4003 | 31 |
| 6.5476 | 7.3576 | 32 |
| 6.4606 | 7.3400 | 33 |
| 6.3945 | 7.3327 | 34 |
| 6.2495 | 7.3435 | 35 |
| 6.0722 | 7.3375 | 36 |
| 6.1324 | 7.3365 | 37 |
| 6.0493 | 7.3458 | 38 |
| 5.9514 | 7.4002 | 39 |
| 5.8638 | 7.3356 | 40 |
| 5.7390 | 7.3488 | 41 |
| 5.6403 | 7.3687 | 42 |
| 5.5442 | 7.3831 | 43 |
| 5.4542 | 7.3888 | 44 |
| 5.3243 | 7.4340 | 45 |
| 5.2295 | 7.4170 | 46 |
| 5.1436 | 7.4110 | 47 |
| 5.0199 | 7.5223 | 48 |
| 4.9058 | 7.5142 | 49 |
| 4.8393 | 7.4926 | 50 |
| 4.7104 | 7.5253 | 51 |
| 4.6212 | 7.5420 | 52 |
| 4.5298 | 7.5799 | 53 |
| 4.4251 | 7.5940 | 54 |
| 4.3130 | 7.5752 | 55 |
| 4.2240 | 7.6315 | 56 |
| 4.1587 | 7.6412 | 57 |
| 4.0442 | 7.6748 | 58 |
| 3.9858 | 7.6193 | 59 |
### Framework versions
- Transformers 4.28.1
- TensorFlow 2.12.0
- Datasets 2.11.0
- Tokenizers 0.13.3