polieste committed on
Commit ab1ce0e · 1 Parent(s): ece31aa

Update README.md

Files changed (1)
  1. README.md +0 -81
README.md CHANGED
@@ -105,10 +105,6 @@ model-index:
  verified: true
  ---
 
- # mT5-multilingual-XLSum
-
- This repository contains the mT5 checkpoint finetuned on the 45 languages of [XL-Sum](https://huggingface.co/datasets/csebuetnlp/xlsum) dataset. For finetuning details and scripts,
- see the [paper](https://aclanthology.org/2021.findings-acl.413/) and the [official repository](https://github.com/csebuetnlp/xl-sum).
 
 
  ## Using this model in `transformers` (tested on 4.11.0.dev0)
@@ -149,80 +145,3 @@ summary = tokenizer.decode(
  print(summary)
  ```
 
- ## Benchmarks
-
- Scores on the XL-Sum test sets are as follows:
-
- Language | ROUGE-1 / ROUGE-2 / ROUGE-L
- ---------|----------------------------
- Amharic | 20.0485 / 7.4111 / 18.0753
- Arabic | 34.9107 / 14.7937 / 29.1623
- Azerbaijani | 21.4227 / 9.5214 / 19.3331
- Bengali | 29.5653 / 12.1095 / 25.1315
- Burmese | 15.9626 / 5.1477 / 14.1819
- Chinese (Simplified) | 39.4071 / 17.7913 / 33.406
- Chinese (Traditional) | 37.1866 / 17.1432 / 31.6184
- English | 37.601 / 15.1536 / 29.8817
- French | 35.3398 / 16.1739 / 28.2041
- Gujarati | 21.9619 / 7.7417 / 19.86
- Hausa | 39.4375 / 17.6786 / 31.6667
- Hindi | 38.5882 / 16.8802 / 32.0132
- Igbo | 31.6148 / 10.1605 / 24.5309
- Indonesian | 37.0049 / 17.0181 / 30.7561
- Japanese | 48.1544 / 23.8482 / 37.3636
- Kirundi | 31.9907 / 14.3685 / 25.8305
- Korean | 23.6745 / 11.4478 / 22.3619
- Kyrgyz | 18.3751 / 7.9608 / 16.5033
- Marathi | 22.0141 / 9.5439 / 19.9208
- Nepali | 26.6547 / 10.2479 / 24.2847
- Oromo | 18.7025 / 6.1694 / 16.1862
- Pashto | 38.4743 / 15.5475 / 31.9065
- Persian | 36.9425 / 16.1934 / 30.0701
- Pidgin | 37.9574 / 15.1234 / 29.872
- Portuguese | 37.1676 / 15.9022 / 28.5586
- Punjabi | 30.6973 / 12.2058 / 25.515
- Russian | 32.2164 / 13.6386 / 26.1689
- Scottish Gaelic | 29.0231 / 10.9893 / 22.8814
- Serbian (Cyrillic) | 23.7841 / 7.9816 / 20.1379
- Serbian (Latin) | 21.6443 / 6.6573 / 18.2336
- Sinhala | 27.2901 / 13.3815 / 23.4699
- Somali | 31.5563 / 11.5818 / 24.2232
- Spanish | 31.5071 / 11.8767 / 24.0746
- Swahili | 37.6673 / 17.8534 / 30.9146
- Tamil | 24.3326 / 11.0553 / 22.0741
- Telugu | 19.8571 / 7.0337 / 17.6101
- Thai | 37.3951 / 17.275 / 28.8796
- Tigrinya | 25.321 / 8.0157 / 21.1729
- Turkish | 32.9304 / 15.5709 / 29.2622
- Ukrainian | 23.9908 / 10.1431 / 20.9199
- Urdu | 39.5579 / 18.3733 / 32.8442
- Uzbek | 16.8281 / 6.3406 / 15.4055
- Vietnamese | 32.8826 / 16.2247 / 26.0844
- Welsh | 32.6599 / 11.596 / 26.1164
- Yoruba | 31.6595 / 11.6599 / 25.0898
-
-
-
- ## Citation
-
- If you use this model, please cite the following paper:
- ```
- @inproceedings{hasan-etal-2021-xl,
- title = "{XL}-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages",
- author = "Hasan, Tahmid and
- Bhattacharjee, Abhik and
- Islam, Md. Saiful and
- Mubasshir, Kazi and
- Li, Yuan-Fang and
- Kang, Yong-Bin and
- Rahman, M. Sohel and
- Shahriyar, Rifat",
- booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
- month = aug,
- year = "2021",
- address = "Online",
- publisher = "Association for Computational Linguistics",
- url = "https://aclanthology.org/2021.findings-acl.413",
- pages = "4693--4703",
- }
- ```