| <!--Copyright 2020 The HuggingFace Team. All rights reserved. | |
| Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | |
| the License. You may obtain a copy of the License at | |
| http://www.apache.org/licenses/LICENSE-2.0 | |
| Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | |
| an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | |
| specific language governing permissions and limitations under the License. | |
| ⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be | |
| rendered properly in your Markdown viewer. | |
| --> | |
| # ConvBERT [[convbert]] | |
| <div class="flex flex-wrap space-x-1"> | |
| <a href="https://huggingface.co/models?filter=convbert"> | |
| <img alt="Models" src="https://img.shields.io/badge/All_model_pages-convbert-blueviolet"> | |
| </a> | |
| <a href="https://huggingface.co/spaces/docs-demos/conv-bert-base"> | |
| <img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue"> | |
| </a> | |
| </div> | |
| ## Overview [[overview]] | |
| The ConvBERT model was proposed by Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, and Shuicheng Yan in the paper [ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://huggingface.co/papers/2008.02496). | |
| The abstract from the paper is the following: | |
| *Pre-trained language models like BERT and its variants have recently achieved impressive performance in various natural language understanding tasks. However, BERT heavily relies on the global self-attention block and thus suffers from a large memory footprint and high computation cost. Although all of its attention heads query the whole input sequence to generate the attention map from a global perspective, we observe that some heads only need to learn local dependencies, which means there is computational redundancy. We therefore propose a novel span-based dynamic convolution to replace these self-attention heads and directly model local dependencies. The novel convolution heads, together with the remaining self-attention heads, form a new mixed attention block that is more efficient at both global and local context learning. We equip BERT with this mixed attention design and build a ConvBERT model. Experiments have shown that ConvBERT outperforms BERT and its variants on various downstream tasks, with lower training cost and fewer model parameters. Remarkably, the ConvBERTbase model achieves a GLUE score of 86.4, 0.7 higher than ELECTRAbase, while using less than 1/4 of the training cost. Code and pre-trained models will be released.* | |
| This model was contributed by [abhishek](https://huggingface.co/abhishek). The original implementation can be found here: https://github.com/yitu-opensource/ConvBert | |
| ## Usage tips [[usage-tips]] | |
| ConvBERT training tips are similar to those of BERT. For usage tips, refer to the [BERT documentation](bert). | |
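Because ConvBERT is used in much the same way as BERT, a forward pass looks like one through any BERT-style encoder. The following is a minimal sketch, assuming `transformers` and PyTorch are installed. The configuration values are deliberately tiny and purely illustrative (they do not correspond to any released checkpoint), and the weights are randomly initialized so nothing is downloaded.

```python
import torch
from transformers import ConvBertConfig, ConvBertModel

# Tiny, illustrative hyperparameters (an assumption for this sketch,
# not the settings of any released ConvBERT checkpoint).
config = ConvBertConfig(
    vocab_size=100,
    hidden_size=32,
    embedding_size=16,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
)

# Building the model from a config gives randomly initialized weights.
model = ConvBertModel(config)
model.eval()

# A batch containing one sequence of 12 random token ids.
input_ids = torch.randint(0, config.vocab_size, (1, 12))

with torch.no_grad():
    outputs = model(input_ids=input_ids)

# One hidden vector of size `hidden_size` per input token.
last_hidden_state = outputs.last_hidden_state
print(tuple(last_hidden_state.shape))
```

In practice a pretrained checkpoint would be loaded with `from_pretrained` and paired with its tokenizer instead of random token ids; the random setup here only keeps the sketch self-contained.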
| ## Resources [[resources]] | |
| - [Text classification task guide](../tasks/sequence_classification) | |
| - [Token classification task guide](../tasks/token_classification) | |
| - [Question answering task guide](../tasks/question_answering) | |
| - [Masked language modeling task guide](../tasks/masked_language_modeling) | |
| - [Multiple choice task guide](../tasks/multiple_choice) | |
| ## ConvBertConfig [[transformers.ConvBertConfig]] | |
| [[autodoc]] ConvBertConfig | |
| ## ConvBertTokenizer [[transformers.ConvBertTokenizer]] | |
| [[autodoc]] ConvBertTokenizer | |
| - build_inputs_with_special_tokens | |
| - get_special_tokens_mask | |
| - create_token_type_ids_from_sequences | |
| - save_vocabulary | |
| ## ConvBertTokenizerFast [[transformers.ConvBertTokenizerFast]] | |
| [[autodoc]] ConvBertTokenizerFast | |
| <frameworkcontent> | |
| <pt> | |
| ## ConvBertModel [[transformers.ConvBertModel]] | |
| [[autodoc]] ConvBertModel | |
| - forward | |
| ## ConvBertForMaskedLM [[transformers.ConvBertForMaskedLM]] | |
| [[autodoc]] ConvBertForMaskedLM | |
| - forward | |
| ## ConvBertForSequenceClassification [[transformers.ConvBertForSequenceClassification]] | |
| [[autodoc]] ConvBertForSequenceClassification | |
| - forward | |
| ## ConvBertForMultipleChoice [[transformers.ConvBertForMultipleChoice]] | |
| [[autodoc]] ConvBertForMultipleChoice | |
| - forward | |
| ## ConvBertForTokenClassification [[transformers.ConvBertForTokenClassification]] | |
| [[autodoc]] ConvBertForTokenClassification | |
| - forward | |
| ## ConvBertForQuestionAnswering [[transformers.ConvBertForQuestionAnswering]] | |
| [[autodoc]] ConvBertForQuestionAnswering | |
| - forward | |
| </pt> | |
| <tf> | |
| ## TFConvBertModel [[transformers.TFConvBertModel]] | |
| [[autodoc]] TFConvBertModel | |
| - call | |
| ## TFConvBertForMaskedLM [[transformers.TFConvBertForMaskedLM]] | |
| [[autodoc]] TFConvBertForMaskedLM | |
| - call | |
| ## TFConvBertForSequenceClassification [[transformers.TFConvBertForSequenceClassification]] | |
| [[autodoc]] TFConvBertForSequenceClassification | |
| - call | |
| ## TFConvBertForMultipleChoice [[transformers.TFConvBertForMultipleChoice]] | |
| [[autodoc]] TFConvBertForMultipleChoice | |
| - call | |
| ## TFConvBertForTokenClassification [[transformers.TFConvBertForTokenClassification]] | |
| [[autodoc]] TFConvBertForTokenClassification | |
| - call | |
| ## TFConvBertForQuestionAnswering [[transformers.TFConvBertForQuestionAnswering]] | |
| [[autodoc]] TFConvBertForQuestionAnswering | |
| - call | |
| </tf> | |
| </frameworkcontent> | |