---
license: apache-2.0
language: en
---

# BART (large-sized model)

## Model description

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.
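
The corrupt-then-reconstruct objective can be illustrated with a small sketch. It shows one possible noising function, span infilling, in which a contiguous span of tokens is replaced by a single mask token. The helper names `infill_span` and `corrupt`, the whitespace tokenization, and the uniform span sampling are assumptions made for illustration; they are not the pre-training code used for BART.

```python
import random

MASK = "<mask>"  # mask token string, used here for illustration only


def infill_span(tokens, start, length):
    """Replace tokens[start:start + length] with a single mask token."""
    return tokens[:start] + [MASK] + tokens[start + length:]


def corrupt(tokens, rng, max_span=3):
    """Mask one randomly chosen span of a non-empty token list.

    Uniform sampling of the span is a simplification chosen for this sketch.
    """
    length = rng.randint(1, min(max_span, len(tokens)))
    start = rng.randint(0, len(tokens) - length)
    return infill_span(tokens, start, length)


original = "the quick brown fox jumps over the lazy dog".split()
corrupted = corrupt(original, random.Random(0))

# A denoising model would be trained to map `corrupted` back to `original`.
print(" ".join(corrupted))
print(" ".join(original))
```

Because the whole span collapses into one mask token, the model has to infer how many tokens are missing as well as what they are.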

BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation), but it also works well for comprehension tasks (e.g. text classification, question answering).

The weights shared here are effectively those of facebook/bart-large, with noise added to the BOS embedding to assist fine-tuning.

## Intended uses & limitations

Several issues have been reported when fine-tuning BART for text generation, and this repo implements the solution discussed in [#15559](https://github.com/huggingface/transformers/issues/15559): adding some noise to the pre-trained model's BOS embedding. This seems to solve the problem of endless BOS generation in a fine-tuned BART model.
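
The idea of perturbing a single embedding row can be sketched as follows. This is a toy illustration on plain Python lists so that it runs without downloading any weights; with the real model the same operation would apply to the BOS row of the token embedding matrix. The helper `perturb_row`, the index `BOS_ID = 0`, the Gaussian distribution, and the scale `NOISE_STD` are all assumptions, since this card does not state how the noise was drawn.

```python
import random


def perturb_row(embeddings, row, std, rng):
    """Return a copy of `embeddings` with Gaussian noise added to one row."""
    noisy = [list(vec) for vec in embeddings]  # copy so the input is untouched
    noisy[row] = [x + rng.gauss(0.0, std) for x in noisy[row]]
    return noisy


BOS_ID = 0        # assumed index of the BOS token in the vocabulary
NOISE_STD = 0.01  # hypothetical noise scale, not taken from this card

rng = random.Random(0)
# Toy embedding table: 3 tokens, 4 dimensions each.
emb = [[0.1 * (i + j) for j in range(4)] for i in range(3)]
perturbed = perturb_row(emb, BOS_ID, NOISE_STD, rng)

print(emb[BOS_ID])
print(perturbed[BOS_ID])
```

Only the BOS row changes; every other embedding stays identical to the original table.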

You can use the raw model for text infilling. However, the model is mostly meant to be fine-tuned on a supervised dataset. Browse the [model hub](https://huggingface.co/models?search=bart) for versions fine-tuned on a task that interests you.

### How to use

Here is how to load this model and extract its hidden states in PyTorch:

```python
from transformers import BartTokenizer, BartModel

# Load the tokenizer and the perturbed weights from the Hub
tokenizer = BartTokenizer.from_pretrained('vedu/bart-large-perturbed')
model = BartModel.from_pretrained('vedu/bart-large-perturbed')

# Tokenize one sentence and return PyTorch tensors
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

# Decoder hidden states, shape (batch_size, sequence_length, hidden_size)
last_hidden_states = outputs.last_hidden_state
```

### BibTeX entry and citation info

```bibtex
@article{DBLP:journals/corr/abs-1910-13461,
  author     = {Mike Lewis and
                Yinhan Liu and
                Naman Goyal and
                Marjan Ghazvininejad and
                Abdelrahman Mohamed and
                Omer Levy and
                Veselin Stoyanov and
                Luke Zettlemoyer},
  title      = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
                Generation, Translation, and Comprehension},
  journal    = {CoRR},
  volume     = {abs/1910.13461},
  year       = {2019},
  url        = {http://arxiv.org/abs/1910.13461},
  eprinttype = {arXiv},
  eprint     = {1910.13461},
  timestamp  = {Thu, 31 Oct 2019 14:02:26 +0100},
  biburl     = {https://dblp.org/rec/journals/corr/abs-1910-13461.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}
```