<!--Copyright 2021 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

*This model was released on 2020-07-28 and added to Hugging Face Transformers on 2021-03-30.*

<div style="float: right;">
    <div class="flex flex-wrap space-x-1">
        <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white" >
    </div>
</div>

# BigBird[[bigbird]]

[BigBird](https://huggingface.co/papers/2007.14062) is a transformer model designed to handle sequences of up to 4096 tokens, compared to 512 tokens for [BERT](./bert). Standard transformers struggle with long inputs because the cost of full attention grows quadratically with sequence length. BigBird addresses this with a sparse attention mechanism: instead of attending to every token at once, it combines local attention, random attention, and a few global tokens to process the whole input efficiently. This keeps computation manageable while still capturing enough of the sequence to understand it well. As a result, BigBird performs particularly well on tasks involving long documents, such as question answering, summarization, and genomics applications.
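
The combination of local, random, and global attention can be pictured at the block level. The sketch below is a simplified illustration only, not the library's implementation: the function name `block_sparse_pattern`, the choice of the first and last blocks as global, the 3-block window, the 3 random blocks, and the block size of 64 are all assumptions made for the example.

```python
import random


def block_sparse_pattern(num_blocks, window_blocks=3, num_random_blocks=3, seed=0):
    """Return, for each query block, the sorted list of key blocks it attends to.

    Hypothetical helper for illustration: the first and last blocks are
    treated as global, and every other block sees the global blocks, a
    sliding window centred on itself, and a few randomly chosen blocks.
    """
    rng = random.Random(seed)
    global_blocks = {0, num_blocks - 1}
    half = window_blocks // 2
    pattern = []
    for q in range(num_blocks):
        if q in global_blocks:
            # Global blocks attend to every block in the sequence.
            pattern.append(list(range(num_blocks)))
            continue
        # Every block attends to the global blocks.
        keys = set(global_blocks)
        # Local attention: neighbouring blocks inside the sliding window.
        keys.update(k for k in range(q - half, q + half + 1) if 0 <= k < num_blocks)
        # Random attention: sample from the blocks not already covered.
        remaining = [k for k in range(num_blocks) if k not in keys]
        keys.update(rng.sample(remaining, min(num_random_blocks, len(remaining))))
        pattern.append(sorted(keys))
    return pattern


if __name__ == "__main__":
    # 4096 tokens with an assumed block size of 64 gives 64 blocks.
    pattern = block_sparse_pattern(num_blocks=4096 // 64)
    sparse_pairs = sum(len(keys) for keys in pattern)
    full_pairs = len(pattern) ** 2
    print(f"block pairs attended: {sparse_pairs} of {full_pairs}")
```

Because each non-global block attends to a bounded number of key blocks, the number of attended block pairs in this sketch grows roughly linearly with the number of blocks rather than quadratically.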

You can find all the original BigBird checkpoints under the [Google](https://huggingface.co/google?search_models=bigbird) organization.

> [!TIP]
> Click on the BigBird models in the right sidebar for more examples of how to apply BigBird to different language tasks.

The example below demonstrates how to predict the `[MASK]` token with [`Pipeline`], [`AutoModel`], and from the command line.

<hfoptions id="usage">
<hfoption id="Pipeline">

```py
import torch
from transformers import pipeline

pipeline = pipeline(
    task="fill-mask",
    model="google/bigbird-roberta-base",
    dtype=torch.float16,
    device=0
)
pipeline("Plants create [MASK] through a process known as photosynthesis.")
```

</hfoption>

<hfoption id="AutoModel">

```py
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "google/bigbird-roberta-base",
)

model = AutoModelForMaskedLM.from_pretrained(
    "google/bigbird-roberta-base",
    dtype=torch.float16,
    device_map="auto",
)
inputs = tokenizer("Plants create [MASK] through a process known as photosynthesis.", return_tensors="pt").to(model.device)

# Run a forward pass without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    predictions = outputs.logits

# Locate the [MASK] position and take the highest-scoring token there
masked_index = torch.where(inputs['input_ids'] == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)

print(f"The predicted token is: {predicted_token}")
```

</hfoption>

<hfoption id="transformers CLI">

```bash
echo -e "Plants create [MASK] through a process known as photosynthesis." | transformers-cli run --task fill-mask --model google/bigbird-roberta-base --device 0
```

</hfoption>
</hfoptions>

## Notes[[notes]]

- BigBird uses absolute position embeddings, so inputs should be padded on the right.
- BigBird supports `original_full` and `block_sparse` attention. If the input sequence length is less than 1024, `original_full` is recommended because sparse patterns offer little benefit for shorter inputs.
- The current implementation uses a window size of 3 blocks and 2 global blocks, supports only the ITC implementation, and does not support `num_random_blocks=0`.
- The sequence length must be divisible by the block size.
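
The notes above can be turned into two small checks before calling the model. This is a minimal sketch in plain Python, not part of the library: the helper names `choose_attention_type` and `pad_right_to_block_multiple` are hypothetical, and the block size of 64 and pad token id of 0 are assumptions for the example. In practice these values would be read from the model configuration and tokenizer.

```python
def choose_attention_type(seq_len, threshold=1024):
    """Pick an attention type following the guidance in the notes.

    Hypothetical helper: inputs shorter than `threshold` tokens use
    full attention, longer ones use block-sparse attention.
    """
    return "original_full" if seq_len < threshold else "block_sparse"


def pad_right_to_block_multiple(input_ids, block_size=64, pad_token_id=0):
    """Right-pad token ids so their length is divisible by `block_size`.

    Returns the padded ids and a matching attention mask (1 = real token,
    0 = padding). `block_size` and `pad_token_id` are assumed values.
    """
    remainder = len(input_ids) % block_size
    pad_len = 0 if remainder == 0 else block_size - remainder
    padded = list(input_ids) + [pad_token_id] * pad_len
    mask = [1] * len(input_ids) + [0] * pad_len
    return padded, mask


if __name__ == "__main__":
    token_ids = list(range(1, 1501))  # stand-in for 1500 real token ids
    padded, mask = pad_right_to_block_multiple(token_ids)
    print(len(token_ids), "->", len(padded), "tokens after right padding")
    print("attention type:", choose_attention_type(len(padded)))
```

The string returned by `choose_attention_type` is intended to be used as the `attention_type` value in the model configuration.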

## Resources[[resources]]

- Read the [BigBird](https://huggingface.co/blog/big-bird) blog post for more details about how its attention mechanism works.

## BigBirdConfig[[bigbirdconfig]]

[[autodoc]] BigBirdConfig

## BigBirdTokenizer[[bigbirdtokenizer]]

[[autodoc]] BigBirdTokenizer
    - get_special_tokens_mask
    - save_vocabulary

## BigBirdTokenizerFast[[bigbirdtokenizerfast]]

[[autodoc]] BigBirdTokenizerFast

## BigBird specific outputs[[bigbird-specific-outputs]]

[[autodoc]] models.big_bird.modeling_big_bird.BigBirdForPreTrainingOutput

## BigBirdModel[[bigbirdmodel]]

[[autodoc]] BigBirdModel
    - forward

## BigBirdForPreTraining[[bigbirdforpretraining]]

[[autodoc]] BigBirdForPreTraining
    - forward

## BigBirdForCausalLM[[bigbirdforcausallm]]

[[autodoc]] BigBirdForCausalLM
    - forward

## BigBirdForMaskedLM[[bigbirdformaskedlm]]

[[autodoc]] BigBirdForMaskedLM
    - forward

## BigBirdForSequenceClassification[[bigbirdforsequenceclassification]]

[[autodoc]] BigBirdForSequenceClassification
    - forward

## BigBirdForMultipleChoice[[bigbirdformultiplechoice]]

[[autodoc]] BigBirdForMultipleChoice
    - forward

## BigBirdForTokenClassification[[bigbirdfortokenclassification]]

[[autodoc]] BigBirdForTokenClassification
    - forward

## BigBirdForQuestionAnswering[[bigbirdforquestionanswering]]

[[autodoc]] BigBirdForQuestionAnswering
    - forward