---
library_name: peft
license: apache-2.0
base_model: google/roberta2roberta_L-24_cnn_daily_mail
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: results
  results: []
---

# results

This model is a fine-tuned version of [google/roberta2roberta_L-24_cnn_daily_mail](https://huggingface.co/google/roberta2roberta_L-24_cnn_daily_mail) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 6.7395
- Rouge1: 33.11
- Rouge2: 20.39
- Rougel: 27.32
- Rougelsum: 27.42
- Bertscore P: 87.57
- Bertscore R: 83.35
- Bertscore F1: 85.27

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bertscore P | Bertscore R | Bertscore F1 |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-----------:|:-----------:|:------------:|
| 36.5558       | 0.8   | 20   | 7.8969          | 32.35  | 20.37  | 27.74  | 27.82     | 87.74       | 83.51       | 85.43        |
| 32.2661       | 1.6   | 40   | 7.0747          | 34.94  | 21.86  | 29.83  | 30.04     | 87.7        | 83.93       | 85.62        |
| 30.3129       | 2.4   | 60   | 6.7395          | 33.11  | 20.39  | 27.32  | 27.42     | 87.57       | 83.35       | 85.27        |

### Framework versions

- PEFT 0.14.0
- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.3.0
- Tokenizers 0.21.0
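
### Effective batch size and learning-rate schedule (sketch)

The training hyperparameters listed in this card imply an effective batch size and a learning-rate curve that can be worked out by hand. The following is a minimal sketch in plain Python, not the training script used for this model. It assumes 25 optimizer steps per epoch (inferred from the results table, where step 20 corresponds to epoch 0.8), that warmup steps are rounded up from `warmup_ratio * total_steps`, and that the cosine schedule decays to zero over a single half-cycle after a linear warmup. All function and constant names are hypothetical.

```python
import math

# Values taken from the "Training hyperparameters" list in this card.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_RATIO = 0.1
NUM_EPOCHS = 3

# Assumption: inferred from the results table (step 20 at epoch 0.8).
STEPS_PER_EPOCH = round(20 / 0.8)


def effective_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device * accumulation * devices


def cosine_lr(step: int, total_steps: int, warmup_steps: int, base_lr: float) -> float:
    """Linear warmup followed by a half-cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
# Assumption: warmup steps are rounded up from the ratio.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
print("total optimizer steps:", total_steps)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, 20, 40, 60, total_steps):
    lr = cosine_lr(step, total_steps, warmup_steps, LEARNING_RATE)
    print(f"step {step:>2}: lr = {lr:.3e}")
```

Under these assumptions the evaluation at step 60 happens while the learning rate is already well into its decay, and the run ends at step 75, so the last row of the results table is not the end-of-training checkpoint.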