José Ángel González
committed on
Commit · b2a2542
1 Parent(s): 0121c8c
Update README.md
README.md
CHANGED
@@ -11,3 +11,20 @@ widget:
News Abstractive Summarization for Spanish (NASES) is a Transformer encoder-decoder model, with the same hyper-parameters as BART, for summarizing Spanish news articles. It is pre-trained on a combination of several self-supervised tasks that help increase the abstractivity of the generated summaries. Four pre-training tasks are combined: sentence permutation, text infilling, Gap Sentence Generation, and Next Segment Generation. Spanish newspapers and Spanish Wikipedia articles were used to pre-train the model (21 GB of raw text, 8.5 million documents).

NASES is fine-tuned for the summarization task on 1,802,919 (document, summary) pairs from the Dataset for Automatic summarization of Catalan and Spanish newspaper Articles (DACSA).
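
The four pre-training tasks named above can be pictured as ways of turning one document into an (input, target) pair. The following is only a toy sketch under stated assumptions, not the authors' implementation: the function names, the `<mask>` token, and the example sentences are all invented for illustration, and span positions and gap sentences are passed in explicitly because the README does not say how they are chosen.

```python
import random

MASK = "<mask>"  # hypothetical mask token; the real vocabulary may differ


def sentence_permutation(sentences, rng):
    """Input: the sentences in shuffled order. Target: the original order."""
    shuffled = sentences[:]
    rng.shuffle(shuffled)
    return " ".join(shuffled), " ".join(sentences)


def text_infilling(tokens, start, length):
    """Input: tokens[start:start+length] replaced by one mask. Target: the original text."""
    corrupted = tokens[:start] + [MASK] + tokens[start + length:]
    return " ".join(corrupted), " ".join(tokens)


def gap_sentence_generation(sentences, gap_indices):
    """Input: the chosen sentences masked out. Target: the masked sentences."""
    gaps = set(gap_indices)
    source = [MASK if i in gaps else s for i, s in enumerate(sentences)]
    target = [s for i, s in enumerate(sentences) if i in gaps]
    return " ".join(source), " ".join(target)


def next_segment_generation(sentences, split):
    """Input: the first `split` sentences. Target: the rest of the document."""
    return " ".join(sentences[:split]), " ".join(sentences[split:])


# Made-up example document (three short sentences).
doc = [
    "El gobierno aprobó la ley.",
    "La oposición votó en contra.",
    "La norma entra en vigor en enero.",
]

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the shuffle is reproducible
    print(sentence_permutation(doc, rng))
    print(text_infilling(doc[0].split(), 1, 2))
    print(gap_sentence_generation(doc, [1]))
    print(next_segment_generation(doc, 1))
```

Gap Sentence Generation and Next Segment Generation both ask the decoder to produce text that is absent from the input, which is presumably why they are described as helping abstractivity. The other two tasks only ask it to restore a corrupted copy.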
+
+
+More details about the pre-training/fine-tuning datasets and the models will be available soon:
+
+@unpublished{DACSA,
+  author = "Vicent Ahuir and Lluís-F. Hurtado and José Ángel González and Encarna Segarra",
+  title = "DACSA: a Dataset for Automatic summarization of Catalan and Spanish newspaper Articles",
+  note = "Unsubmitted",
+}
+
+@unpublished{NAS,
+  author = "Vicent Ahuir and Lluís-F. Hurtado and José Ángel González and Encarna Segarra",
+  title = "NASCA and NASES: Two monolingual pre-trained models for abstractive summarization in Catalan and Spanish",
+  note = "Submitted to the Special Issue on Current Approaches and Applications in Natural Language Processing (Applied Sciences)",
+}