How to use Ssarion/mt5-small-multi-news with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "summarization" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("summarization", model="Ssarion/mt5-small-multi-news")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("Ssarion/mt5-small-multi-news")
model = AutoModelForSeq2SeqLM.from_pretrained("Ssarion/mt5-small-multi-news")

This model is a fine-tuned version of google/mt5-small on the multi_news dataset (alexfabbri/multi_news). Its results on the evaluation set are reported in the training results table below.
Text summarization is the intended use of this model. It was trained for a single epoch, so further training would likely improve the results.
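As a minimal sketch of this intended use, the helper below wraps tokenization, generation, and decoding. The function names `summarize` and `demo`, the 512-token input limit, and the generation settings (`num_beams=4`, `max_new_tokens=128`) are illustrative assumptions, not values taken from this model card.

```python
def summarize(text, tokenizer, model, max_input_tokens=512, max_new_tokens=128):
    """Generate a summary of `text` with a seq2seq tokenizer/model pair.

    The length limits and beam count are illustrative assumptions.
    """
    # Tokenize and truncate long inputs to the assumed input budget.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    # Beam search decoding; returns a batch of token-id sequences.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=4,
    )
    # Decode the first (only) sequence and drop special tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def demo():
    """Download the checkpoint and summarize a short placeholder article."""
    # Imported here so that `summarize` itself has no hard dependency.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    name = "Ssarion/mt5-small-multi-news"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    article = "Replace this placeholder with the news article(s) to summarize."
    print(summarize(article, tokenizer, model))
```

Calling `demo()` downloads the checkpoint on first use. Because `summarize` takes the tokenizer and model as arguments, the same helper should work with other seq2seq checkpoints loaded the same way.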
For the training data, we used 10,000 samples from the multi_news train split. For the evaluation data, we used 500 samples from the multi_news evaluation split.
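The training sample count can be cross-checked against the results table further down, which reports 1250 steps at epoch 1.0. Assuming every optimizer step consumed the same number of samples and no partial batch was dropped, this implies an effective batch size of 8 (per-device batch size times gradient accumulation steps times device count). This is an inference from the reported numbers, not a stated hyperparameter.

```python
# Numbers reported in this model card.
train_samples = 10_000   # training subset size
steps_per_epoch = 1_250  # "Step" column at epoch 1.0

# Assumption: each optimizer step consumed the same number of samples.
effective_batch_size, remainder = divmod(train_samples, steps_per_epoch)
print(effective_batch_size, remainder)  # 8 0
```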
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|---|---|---|---|---|---|---|---|
| 5.2732 | 1.0 | 1250 | 3.2170 | 22.03 | 6.95 | 18.41 | 18.72 |
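
To make the ROUGE columns concrete, here is a simplified sketch of ROUGE-N as an F1 score over overlapping n-grams. It assumes lowercasing and whitespace tokenization only. The scores in the table were presumably produced by a standard ROUGE implementation with its own tokenization and are shown on a 0 to 100 scale, so this sketch (which returns values between 0 and 1) will not reproduce them exactly. The function names and example sentences are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 using lowercased whitespace tokens."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as in either text.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n_f1(reference, candidate, n=1), 4))  # 0.8333
print(round(rouge_n_f1(reference, candidate, n=2), 4))  # 0.6
```

In the example, five of six unigrams and three of five bigrams match, which is why the bigram score is lower. The same pattern appears in the table, where Rouge2 is well below Rouge1.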