---
language: en
license: apache-2.0
datasets:
- bookcorpus
- wikipedia
- vblagoje/cc_news
---

# BigBird base model
BigBird is a sparse-attention-based transformer that extends Transformer-based models such as BERT to much longer sequences. Moreover, BigBird comes with a theoretical understanding of the capabilities of a complete transformer that the sparse model can handle.

It is a model pretrained on English text using a masked language modeling (MLM) objective. It was introduced in this [paper](https://arxiv.org/abs/2007.14062) and first released in this [repository](https://github.com/google-research/bigbird).
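Since the model was pretrained with an MLM objective, it can fill in masked tokens directly. As a minimal sketch (this uses the original PyTorch checkpoint through the `transformers` pipeline API, not the ONNX export described below):

```python
from transformers import pipeline

# Load the original checkpoint as a fill-mask pipeline
unmasker = pipeline("fill-mask", model="google/bigbird-roberta-base")

# BigBird's mask token is [MASK]
print(unmasker("Paris is the [MASK] of France."))
```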
## Model description

BigBird relies on **block sparse attention** instead of normal attention (i.e. BERT's full attention) and can handle sequences up to a length of 4096 at a much lower compute cost than BERT. It has achieved state-of-the-art results on various tasks involving very long sequences, such as long document summarization and question answering with long contexts.
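For illustration, a minimal sketch of running the PyTorch checkpoint on a long input; `attention_type`, `block_size`, and `num_random_blocks` are configuration options of the `transformers` BigBird implementation, and the values shown here are its defaults:

```python
from transformers import AutoTokenizer, BigBirdModel

tokenizer = AutoTokenizer.from_pretrained("google/bigbird-roberta-base")

# block_sparse attention lets the model scale to 4096 tokens;
# block_size and num_random_blocks control the sparsity pattern
model = BigBirdModel.from_pretrained(
    "google/bigbird-roberta-base",
    attention_type="block_sparse",
    block_size=64,
    num_random_blocks=3,
)

# An input far longer than BERT's 512-token limit
long_text = "A very long document. " * 600
inputs = tokenizer(long_text, return_tensors="pt", truncation=True, max_length=4096)

outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```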
## Original implementation

Follow [this link](https://huggingface.co/google/bigbird-roberta-base) to see the original implementation.
## How to use

Download the model by cloning the repository via `git clone https://huggingface.co/OWG/bigbird-roberta-base`.

Then you can use the model with the following code:
|
```python
from onnxruntime import InferenceSession, SessionOptions, GraphOptimizationLevel
from transformers import AutoTokenizer

# BigBird uses a SentencePiece-based tokenizer, so load it with AutoTokenizer
# (BertTokenizer is not compatible with this checkpoint)
tokenizer = AutoTokenizer.from_pretrained("google/bigbird-roberta-base")

# Enable all graph optimizations for faster inference
options = SessionOptions()
options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL

session = InferenceSession("path/to/model.onnx", sess_options=options)
session.disable_fallback()

text = "Replace me by any text you want to encode."

# Tokenize straight to NumPy arrays, the format ONNX Runtime expects
inputs = tokenizer(text, return_tensors="np", return_attention_mask=True)

outputs_name = session.get_outputs()[0].name
outputs = session.run(output_names=[outputs_name], input_feed=dict(inputs))
```
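The session returns a list of NumPy arrays, one per requested output name. Assuming the first output of the exported graph is the encoder's last hidden state (the usual layout for an exported base model), you can sanity-check the result like this:

```python
# Shape should be (batch_size, sequence_length, hidden_size),
# with hidden_size = 768 for the base model
last_hidden_state = outputs[0]
print(last_hidden_state.shape)
```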