Longformer: The Long-Document Transformer
Paper • arXiv:2004.05150 • Published
A led-large-16384 model to summarize ArXiv papers. Inputs are the abstracts of papers and full documents, and outputs are the summaries of the papers.

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("AlgorithmicResearchGroup/led_large_16384_arxiv_summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("AlgorithmicResearchGroup/led_large_16384_arxiv_summarization")
```
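LED combines sliding-window local attention with global attention on selected tokens; for summarization, the usual convention is to give the first token (`<s>`) global attention and pass the result to `generate` via `global_attention_mask`. A minimal sketch of building that mask, using toy values rather than real tokenizer output:

```python
# Toy sketch: constructing a global attention mask for LED.
# seq_len and the all-ones attention mask are illustrative stand-ins for
# what the tokenizer would produce; real inputs can be up to 16384 tokens.
seq_len = 8
attention_mask = [1] * seq_len          # regular padding mask from the tokenizer
global_attention_mask = [0] * seq_len   # 0 = local (sliding-window) attention
global_attention_mask[0] = 1            # global attention on the first (<s>) token

assert sum(global_attention_mask) == 1  # only one token attends globally
```

In practice these lists would be tensors, and the mask is passed alongside the input ids, e.g. `model.generate(input_ids, global_attention_mask=global_attention_mask)`.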
AllenAI's Longformer Encoder-Decoder (LED).
As described in Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, and Arman Cohan, led-base-16384 was initialized from bart-base, since both models share the exact same architecture. To be able to process 16K tokens, bart-base's position embedding matrix was simply copied 16 times.
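The position-embedding copy can be illustrated with a toy sketch (illustrative sizes and values, not the actual BART weights or the code used to build the checkpoint): bart-base has 1024 learned position embeddings, and tiling that matrix 16 times along the position axis yields the 16384 positions of led-base-16384.

```python
# Toy sketch of copying bart-base's position embedding matrix 16 times.
# dim=4 keeps the example small; real BART uses hidden size 768.
bart_positions, dim = 1024, 4
bart_pos_emb = [[float(p)] * dim for p in range(bart_positions)]  # 1024 x dim

led_pos_emb = bart_pos_emb * 16  # tile the matrix 16 times -> 16384 positions

assert len(led_pos_emb) == 16384
# Position 1024 starts over with the embedding of position 0, and so on.
assert led_pos_emb[1024] == led_pos_emb[0]
```

Because the copies repeat, position 16383 reuses the embedding originally learned for position 1023; fine-tuning then adapts these repeated embeddings to long inputs.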
```python
# Use a pipeline as a high-level helper
# Warning: pipeline type "summarization" is no longer supported in transformers v5.
# You must load the model directly (see above) or downgrade to v4.x with:
#   pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("summarization", model="AlgorithmicResearchGroup/led_large_16384_arxiv_summarization")
```