Longformer: The Long-Document Transformer (arXiv:2004.05150)
How to use bluenguyen/led-bartpho-word-base-16384 with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("bluenguyen/led-bartpho-word-base-16384")
model = AutoModelForSeq2SeqLM.from_pretrained("bluenguyen/led-bartpho-word-base-16384")

This model was initialized from vinai/bartpho-word-base and converted to AllenAI's Longformer Encoder-Decoder (LED), following Longformer: The Long-Document Transformer.
To process 16K tokens, bartpho-word-base's position embedding matrix was simply copied 16 times.
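The copying step can be illustrated with a minimal sketch. It is not the conversion script used for this model: the helper name `extend_position_embeddings`, the 1,024-position base length, and the hidden size of 768 are assumptions chosen for illustration, and any extra offset rows a real BART-style position embedding may carry are ignored.

```python
import numpy as np


def extend_position_embeddings(pos_emb: np.ndarray, copies: int) -> np.ndarray:
    """Repeat a (positions, hidden) matrix `copies` times along the position axis."""
    return np.tile(pos_emb, (copies, 1))


# Assumed, illustrative shapes: 1,024 learned positions, hidden size 768.
rng = np.random.default_rng(0)
base = rng.standard_normal((1024, 768)).astype(np.float32)

# 16 copies of 1,024 positions give 16,384 positions.
extended = extend_position_embeddings(base, copies=16)
print(extended.shape)
```

Under this scheme, position 1,024 of the extended matrix reuses the embedding of position 0, position 1,025 reuses that of position 1, and so on, so the model starts from position embeddings it has already learned rather than from random values.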
This model is particularly suited to long-range summarization and question answering.
This notebook shows how an LED model can be fine-tuned effectively on a downstream task.