	---
	license: apache-2.0
	language: en
	---

	# BART (large-sized model)

	## Model description

	BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.
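
To make step (1) concrete, here is a minimal sketch of one noising function, text infilling, in which a span of tokens is replaced by a single mask token. It is only an illustration: the helper name `infill_span` is hypothetical, it works on whitespace-separated words rather than BART's subword tokens, and the span is chosen by hand instead of being sampled.

```python
def infill_span(tokens, start, length, mask_token="<mask>"):
    """Replace tokens[start:start + length] with a single mask token.

    A zero-length span simply inserts a mask token at `start`.
    """
    if start < 0 or length < 0 or start + length > len(tokens):
        raise ValueError("span falls outside the token list")
    return tokens[:start] + [mask_token] + tokens[start + length:]


original = "the quick brown fox jumps over the lazy dog".split()

# Corrupt the text: three words collapse into one mask token.
corrupted = infill_span(original, start=2, length=3)
print(corrupted)
```

During pre-training the model would receive something like `corrupted` as encoder input and be trained to reproduce `original`, so it has to infer both the missing words and how many there were.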

	BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering).

	The weights shared here are those of facebook/bart-large, with noise added to the BOS token embedding to make fine-tuning more reliable.

	## Intended uses & limitations

	Several issues have been reported when fine-tuning BART for text generation. This repo implements the solution discussed in [#15559](https://github.com/huggingface/transformers/issues/15559): adding some noise to the pre-trained model's BOS embedding. This seems to solve the problem of a fine-tuned BART model generating BOS tokens endlessly.
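
The kind of perturbation described above can be sketched as follows. This is not the script used to produce these weights: the helper name `perturb_embedding_row`, the Gaussian noise, the scale `std`, and the seed are all assumptions chosen for illustration, and the example runs on a toy embedding so that it needs no download.

```python
import torch
from torch import nn


def perturb_embedding_row(embedding: nn.Embedding, token_id: int,
                          std: float = 1e-3, seed: int = 0) -> None:
    """Add Gaussian noise in place to a single row of an embedding matrix.

    `std` and `seed` are illustrative values, not the ones used for this repo.
    """
    generator = torch.Generator().manual_seed(seed)
    with torch.no_grad():
        row = embedding.weight[token_id]  # view into the weight matrix
        noise = torch.randn(row.shape, generator=generator, dtype=row.dtype)
        row.add_(noise * std)  # modifies embedding.weight in place


# Toy stand-in for a model's token embedding table.
toy = nn.Embedding(num_embeddings=8, embedding_dim=4)
before = toy.weight.detach().clone()

# Perturb only row 0, standing in for the BOS token id.
perturb_embedding_row(toy, token_id=0)

# Per-row flag: True where any value in the row changed.
changed = (toy.weight.detach() != before).any(dim=1)
print(changed)
```

With a real checkpoint, the same helper could plausibly be applied to `model.get_input_embeddings()` with `tokenizer.bos_token_id` as the row index, followed by `model.save_pretrained(...)`.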

	You can use the raw model for text infilling. However, the model is mostly meant to be fine-tuned on a supervised dataset. See the [model hub](https://huggingface.co/models?search=bart) to look for fine-tuned versions on a task that interests you.

	### How to use

	Here is how to use this model in PyTorch:

	```python
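	from transformers import BartTokenizer, BartModel

	# Load the tokenizer and the perturbed BART-large weights from the Hub
	tokenizer = BartTokenizer.from_pretrained('vedu/bart-large-perturbed')
	model = BartModel.from_pretrained('vedu/bart-large-perturbed')

	# Tokenize one sentence and return PyTorch tensors
	inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
	outputs = model(**inputs)

	# Final decoder hidden states, shape (batch_size, sequence_length, hidden_size)
	last_hidden_states = outputs.last_hidden_state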
	```

	### BibTeX entry and citation info

	```bibtex
	@article{DBLP:journals/corr/abs-1910-13461,
	author = {Mike Lewis and
	Yinhan Liu and
	Naman Goyal and
	Marjan Ghazvininejad and
	Abdelrahman Mohamed and
	Omer Levy and
	Veselin Stoyanov and
	Luke Zettlemoyer},
	title = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
	Generation, Translation, and Comprehension},
	journal = {CoRR},
	volume = {abs/1910.13461},
	year = {2019},
	url = {http://arxiv.org/abs/1910.13461},
	eprinttype = {arXiv},
	eprint = {1910.13461},
	timestamp = {Thu, 31 Oct 2019 14:02:26 +0100},
	biburl = {https://dblp.org/rec/journals/corr/abs-1910-13461.bib},
	bibsource = {dblp computer science bibliography, https://dblp.org}
	}
	```