# rada-nlp
This model was trained from scratch on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.6418
- Rouge1: 32.2628
- Rouge2: 17.6188
- Rougel: 28.3685
- Rougelsum: 28.3035
## Model description
More information needed
## Intended uses & limitations
More information needed
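No usage details are documented, but the ROUGE metrics above suggest a sequence-to-sequence summarization setup. Below is a minimal inference sketch, assuming the checkpoint is a standard Hugging Face seq2seq model; the repository id `rada-nlp` is taken from the title and may need the full `<user>/rada-nlp` namespace on the Hub.

```python
from transformers import pipeline

# Hypothetical repo id; replace with the full "<user>/rada-nlp" Hub path.
summarizer = pipeline("summarization", model="rada-nlp")

text = "Your input document goes here."
result = summarizer(text, max_length=128, min_length=16, do_sample=False)
print(result[0]["summary_text"])
```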
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
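For reference, here is a sketch of how these values map onto `Seq2SeqTrainingArguments` in Transformers 4.51. The output directory, evaluation strategy, and `predict_with_generate` flag are assumptions, since only the hyperparameters above are recorded in the card.

```python
from transformers import Seq2SeqTrainingArguments

# Values taken from the list above; output_dir and the eval settings
# are assumptions, not recorded in the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="rada-nlp",            # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=8,    # effective train batch size: 2 * 8 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
    eval_strategy="epoch",            # assumed: metrics are reported per epoch
    predict_with_generate=True,       # assumed: needed for ROUGE during eval
)
```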
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 1.8877 | 1.0 | 4 | 2.6676 | 32.1274 | 17.427 | 27.543 | 28.0416 |
| 1.8556 | 2.0 | 8 | 2.6705 | 31.2511 | 16.3095 | 26.6854 | 26.8166 |
| 1.8127 | 3.0 | 12 | 2.6705 | 31.037 | 16.0077 | 26.813 | 26.6464 |
| 1.784 | 4.0 | 16 | 2.6686 | 31.5008 | 16.2333 | 26.9957 | 26.7969 |
| 1.7672 | 5.0 | 20 | 2.6711 | 31.2118 | 15.9968 | 26.9476 | 26.9864 |
| 1.7407 | 6.0 | 24 | 2.6716 | 31.4189 | 15.9951 | 26.8681 | 26.7424 |
| 1.742 | 7.0 | 28 | 2.6701 | 30.9705 | 16.0005 | 26.5473 | 26.8081 |
| 1.7356 | 8.0 | 32 | 2.6687 | 31.906 | 17.254 | 27.7267 | 27.6687 |
| 1.7271 | 9.0 | 36 | 2.6654 | 31.8302 | 17.1851 | 27.4294 | 27.4945 |
| 1.7224 | 10.0 | 40 | 2.6606 | 31.5091 | 17.1353 | 27.8425 | 27.5751 |
| 1.7207 | 11.0 | 44 | 2.6575 | 31.6189 | 17.3582 | 27.5163 | 27.519 |
| 1.7404 | 12.0 | 48 | 2.6539 | 32.0071 | 17.1878 | 27.6051 | 27.7916 |
| 1.7213 | 13.0 | 52 | 2.6504 | 32.6314 | 17.5002 | 28.0328 | 28.0245 |
| 1.7606 | 14.0 | 56 | 2.6472 | 32.5161 | 17.4726 | 28.16 | 28.4421 |
| 1.7839 | 15.0 | 60 | 2.6444 | 32.3599 | 17.9836 | 27.9445 | 28.0023 |
| 1.812 | 16.0 | 64 | 2.6418 | 32.2628 | 17.6188 | 28.3685 | 28.3035 |
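The ROUGE columns above are on a 0-100 scale. A minimal sketch of computing comparable scores with the `evaluate` library follows; the predictions and references shown are placeholders, not the actual evaluation data.

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder data; in practice these come from model generations on the eval set.
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions in [0, 1]; the table above reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```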
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1