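T5 with Mixture of Experts (MoE, token-choice routing), fine-tuned on the XSUM dataset for abstractive summarization.
This model uses a sparse Mixture of Experts with learned token-choice top-k routing: a learned router scores every expert for each token, and the token is sent only to its k highest-scoring experts.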
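To make the routing idea concrete, here is a minimal pure-Python sketch of token-choice top-k routing. It is an illustration under stated assumptions, not this model's implementation: the function name `top_k_route`, the four experts, `k=2`, and the renormalization of the selected gate weights are all hypothetical choices that this card does not specify.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Token-choice routing: each token picks its k highest-scoring experts.

    router_logits: one list of per-expert scores for each token.
    Returns, per token, a list of (expert_index, gate_weight) pairs.
    Gate weights are renormalized over the chosen experts so they sum to 1
    (an assumption; some implementations keep the raw probabilities).
    """
    routes = []
    for logits in router_logits:
        probs = softmax(logits)
        # Indices of the k experts with the largest router probability.
        top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        norm = sum(probs[i] for i in top)
        routes.append([(i, probs[i] / norm) for i in top])
    return routes


# Hypothetical router scores for 2 tokens over 4 experts.
router_logits = [
    [2.0, 0.5, -1.0, 0.1],
    [-0.3, 0.2, 1.5, 1.4],
]
routes = top_k_route(router_logits, k=2)
for token_id, route in enumerate(routes):
    print(token_id, [(expert, round(weight, 3)) for expert, weight in route])
```

In a real MoE layer the router logits would come from a learned linear projection of each token's hidden state, and each token's output would be the gate-weighted sum of its selected experts' outputs.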
Key Features:
The model was trained on the XSUM dataset. Each example consists of a BBC news article and a one-sentence summary.
from transformers import T5Tokenizer
# Load tokenizer
tokenizer = T5Tokenizer.from_pretrained("YOUR_USERNAME/t5-base-xsum-lora")
# Note: For MoE models, you need to reconstruct the architecture
# See the model repository for detailed loading instructions
Evaluate the model with standard ROUGE metrics for overlap with the reference summary and SummaC scores for consistency with the source article.
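As a rough illustration of what ROUGE-N measures, the sketch below computes n-gram overlap precision, recall, and F1 between a generated summary and a reference. It is a simplified stand-in, assuming lowercased whitespace tokenization with no stemming; the helper name `rouge_n` is hypothetical, and reported numbers should come from an established ROUGE implementation. ROUGE-L and SummaC are not covered here.

```python
from collections import Counter


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap between two strings."""

    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy strings for illustration only, not taken from XSUM.
scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
print(scores)
```

Passing `n=2` gives the bigram variant (ROUGE-2) under the same simplifications.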
The model was trained using:
If you use this model, please cite the XSUM dataset:
@inproceedings{narayan-etal-2018-dont,
title = "Don{'}t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization",
author = "Narayan, Shashi and Cohen, Shay B. and Lapata, Mirella",
booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
year = "2018",
}