---
license: apache-2.0
language: en
---

**NOTE: This is the FP32 version of [Facebook's official bart-large](https://huggingface.co/facebook/bart-large).**

# BART (large-sized model)

BART model pre-trained on the English language. It was introduced in the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461) by Lewis et al. and first released in [this repository](https://github.com/pytorch/fairseq/tree/master/examples/bart).

Disclaimer: The team releasing BART did not write a model card for this model, so this model card has been written by the Hugging Face team.

## Model description

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.

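To make the pre-training objective concrete, here is an illustrative (corrupted input, reconstruction target) pair of the kind produced by text infilling, one of the noising functions described in the paper; the sentence itself is made up:

```python
# Illustrative only (made-up sentence): text infilling replaces a sampled
# span of tokens with a single <mask> token, and the model is trained to
# reconstruct the original, uncorrupted text.
corrupted = "My dog <mask> cute."          # noised input seen by the encoder
original = "My dog is really very cute."   # reconstruction target for the decoder
```
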
BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering).

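For instance, [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn), a bart-large checkpoint fine-tuned on CNN/DailyMail, can summarize text out of the box. A minimal sketch (note that this loads that separate fine-tuned checkpoint, not the raw model in this card):

```python
from transformers import pipeline

# Load a BART checkpoint already fine-tuned for summarization
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "The tower is 324 metres (1,063 ft) tall, about the same height as "
    "an 81-storey building, and the tallest structure in Paris."
)
print(summarizer(article, max_length=40, min_length=10, do_sample=False))
```
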
## Intended uses & limitations

You can use the raw model for text infilling. However, the model is mostly meant to be fine-tuned on a supervised dataset. See the [model hub](https://huggingface.co/models?search=bart) to look for fine-tuned versions on a task that interests you.

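As a sketch of raw-model text infilling, mask a span with the tokenizer's mask token and let the model regenerate the sentence (the example sentence is arbitrary):

```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# "<mask>" is tokenizer.mask_token for BART
text = "UN Chief Says There Is No <mask> in Syria"
input_ids = tokenizer(text, return_tensors="pt").input_ids

# Generate the reconstructed sequence and decode it
generated_ids = model.generate(input_ids, max_length=20)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```
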
### How to use

Here is how to use this model in PyTorch:

```python
from transformers import BartTokenizer, BartModel

# Load the tokenizer and the bare BART model (no task-specific head)
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
model = BartModel.from_pretrained('facebook/bart-large')

# Tokenize a sentence and run a forward pass
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

# Final hidden states of the decoder, shape (batch_size, sequence_length, hidden_size)
last_hidden_states = outputs.last_hidden_state
```

### BibTeX entry and citation info

```bibtex
@article{DBLP:journals/corr/abs-1910-13461,
  author     = {Mike Lewis and
                Yinhan Liu and
                Naman Goyal and
                Marjan Ghazvininejad and
                Abdelrahman Mohamed and
                Omer Levy and
                Veselin Stoyanov and
                Luke Zettlemoyer},
  title      = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
                Generation, Translation, and Comprehension},
  journal    = {CoRR},
  volume     = {abs/1910.13461},
  year       = {2019},
  url        = {http://arxiv.org/abs/1910.13461},
  eprinttype = {arXiv},
  eprint     = {1910.13461},
  timestamp  = {Thu, 31 Oct 2019 14:02:26 +0100},
  biburl     = {https://dblp.org/rec/journals/corr/abs-1910-13461.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}
```