---
language: en
license: apache-2.0
tags:
- summarization
datasets: arxiv-summarization
model-index:
- name: ArtifactAI/led_base_16384_arxiv_summarization
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: ccdv/arxiv-summarization
      type: ccdv/arxiv-summarization
      config: section
      split: test
    metrics:
    - type: rouge
      value: 37.3255
      name: ROUGE-1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODIyZWNhMTgxNDlkYThhNGFlOGIxYjhhMTU4Y2JjN2I2ZDVkYWVhMmU5ZjQxZmQ3ZGY4ZmY1Y2Y2YzYwZjg5MCIsInZlcnNpb24iOjF9.Q5rZaUa1WvJThE1dOVOWEAOTweDkQPilaP9OCdM1W7ypC-XVTrKC-XjeYvgpET8GSqMROoYP9Z0oJdD1KcWeCw
    - type: rouge
      value: 10.8948
      name: ROUGE-2
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYmJkN2E4YzE3MWY0ZTg4YjFkMGY5MjY2YjhmYzBjZGU3Mjc2NjNhYzkwMDkwOTMwNjdmYWI1ZmY2YmQ3OTA2MiIsInZlcnNpb24iOjF9.u9SrzD-QRXU2mboRwkhgyJcDGPfZoGY5vCoC4ROUc2WLB9IcSypzCAfGsIg488aWJ-iGUmfwbGQqj8Vb50mmCA
    - type: rouge
      value: 20.3875
      name: ROUGE-L
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNzk5NTM4MjM0MGU0OTdmZWEzMjhkMmUxMTY3YTVmMzUzODllZWEwMWEwNjE5ZWNiYzY0MjM1MTFlZWE3NmNmNiIsInZlcnNpb24iOjF9.tJxNOMKwjJlTVhcjoLdy8phj4cSG3b5YaQd5vzl9RJc-kCLcC7Q_F7LDYlEFa7L2S04b6YAcn1JzPsCNy9avAA
    - type: rouge
      value: 33.3014
      name: ROUGE-LSUM
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZDVlZjZhMWNlZmE1YmQ1NDQ4ZmIyMzU5YjgxZmE5ZDEzYWJlNDBiODJjZDBhZWYyMmJhYmE4MWQ3ZGE4ZDUxMCIsInZlcnNpb24iOjF9.NGxXK6cEvyIia_iCjuIeR_JL0fKNONDmnaPKslwf56p7Hletg44oi17jM7LIkZ6ToZb31vvcKjx2DO4-k1V0CQ
    - type: loss
      value: 3.182162284851074
      name: loss
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiM2U1YjkzMmIyNmEzYjlhNWZkNTNmMjgzNjZlMmY2ZWY1OGIyNzM2YmU1MzdiMDAxZDVmNmE5OGNiYThlNTA4ZiIsInZlcnNpb24iOjF9.CeWkK2aAodOUyj7omgJ0sq66GDTuEBRIuDOxLCkw6h1UshWCY2KT-uCUNcQfKMIvPaEjqIKjvtbWBKkmHipHAw
    - type: gen_len
      value: 145.5905
      name: gen_len
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZDMzNTIzZTcwMzczNGEzMmU4YjAyOTZhNDVlNmUyNzVjZGE5MjNhYzQ4MGNlYmQ4MjNlOWY4YzY0NDExNDhiZCIsInZlcnNpb24iOjF9.fX3AuS-fWZfYe5KPDr8FSxuVZYwcUKglSIhKYIVdwTsfXgUVTdDzC6wBiBRpS3ybW0yFSxlKnAbBdJEshOpDBw
---
## Introduction
A led-base-16384 model for summarizing arXiv papers. Inputs can be paper abstracts or full documents; outputs are summaries of the papers.
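
The input/output behavior described above can be sketched with the `transformers` library. This is a minimal sketch, not the author's reference code: the model name is taken from this card's metadata, while the helper names (`build_global_attention_mask`, `summarize`), the beam count, and the length limits are illustrative assumptions. Placing global attention on the first token only follows the usual recommendation for LED summarization.

```python
def build_global_attention_mask(seq_len: int) -> list:
    """Return a mask with global attention (1) on the first token only.

    All other positions use local, sliding-window attention (0).
    """
    if seq_len < 1:
        raise ValueError("seq_len must be at least 1")
    return [1] + [0] * (seq_len - 1)


def summarize(
    text: str,
    model_name: str = "ArtifactAI/led_base_16384_arxiv_summarization",
    max_input_tokens: int = 16384,   # assumption: the full 16K window
    max_summary_tokens: int = 256,   # assumption: illustrative limit
) -> str:
    """Summarize `text` with the LED checkpoint named in this card."""
    # Heavy imports are deferred so the helper above stays dependency-free.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    seq_len = inputs["input_ids"].shape[1]
    global_attention_mask = torch.tensor([build_global_attention_mask(seq_len)])

    with torch.no_grad():
        summary_ids = model.generate(
            inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            global_attention_mask=global_attention_mask,
            max_length=max_summary_tokens,
            num_beams=4,  # assumption: illustrative beam width
        )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    paper_text = "Replace this string with the abstract or full text of a paper."
    print(summarize(paper_text))
```

Running the script downloads the checkpoint on first use. Long inputs need considerable memory, so lowering `max_input_tokens` may be necessary on smaller machines.
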

The model is based on [AllenAI's Longformer Encoder-Decoder (LED)](https://github.com/allenai/longformer#longformer).
As described in [Longformer: The Long-Document Transformer](https://arxiv.org/pdf/2004.05150.pdf) by Iz Beltagy, Matthew E. Peters, and Arman Cohan, *led-base-16384* was initialized from [*bart-base*](https://huggingface.co/facebook/bart-base) since both models share the exact same architecture. To process 16K tokens, *bart-base*'s position embedding matrix was simply copied 16 times.
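
The position-embedding copy described above can be illustrated with a toy example. This is only a sketch of the idea: it assumes the base matrix has 1,024 position rows (so that 16 × 1,024 = 16,384), uses a tiny stand-in matrix, and ignores any special offsets the real conversion script may handle. The function name `tile_position_embeddings` is hypothetical.

```python
def tile_position_embeddings(base_embeddings, copies=16):
    """Stack `copies` copies of the base position-embedding rows end to end."""
    tiled = []
    for _ in range(copies):
        # Copy each row so the tiled matrix does not alias the base matrix.
        tiled.extend(list(row) for row in base_embeddings)
    return tiled


# Toy stand-in: 4 positions of dimension 2 instead of 1,024 positions.
base = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
extended = tile_position_embeddings(base, copies=16)

print(len(extended))                        # 64 rows = 16 copies of 4 positions
print(extended[5] == base[5 % len(base)])   # True: position 5 reuses base row 1
```

Under this scheme, position `i` in the extended model starts from the embedding the base model learned for position `i mod N`, where `N` is the base model's number of positions.
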

### ROUGE-2
| Type | Score |
| --- | --- |
| `precision` | 0.1839148953011932 |
| `recall` | 0.14904707945189774 |
| `fmeasure` | 0.1580026685776864 |
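
To make the three rows above concrete, here is a minimal sketch of how ROUGE-2 precision, recall, and F-measure are defined for a single reference/candidate pair. It assumes lowercased whitespace tokenization, which is a simplification; real ROUGE implementations apply their own tokenization and optional stemming, so this code will not reproduce the scores in the table. The function names are illustrative.

```python
from collections import Counter


def bigram_counts(tokens):
    """Count adjacent token pairs (bigrams) in a token list."""
    return Counter(zip(tokens, tokens[1:]))


def rouge2(reference: str, candidate: str) -> dict:
    """Compute ROUGE-2 precision, recall, and F-measure for one pair."""
    ref = bigram_counts(reference.lower().split())
    cand = bigram_counts(candidate.lower().split())

    # Clipped overlap: each bigram counts at most as often as in either text.
    overlap = sum((ref & cand).values())

    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    fmeasure = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "fmeasure": fmeasure}


# 3 of the 5 bigrams on each side match, so precision = recall = 3/5.
scores = rouge2("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

If the scores in the table are averages over the test examples (an assumption; the card does not say), the averaged F-measure need not equal the harmonic mean of the averaged precision and recall, which would explain why the three reported values do not satisfy that identity.
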