<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->
# DeBERTa[[deberta]]

## Overview[[overview]]

The DeBERTa model was proposed in [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://huggingface.co/papers/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen. It builds on Google's BERT model released in 2018 and Facebook's RoBERTa model released in 2019.

DeBERTa improves on RoBERTa with disentangled attention and enhanced mask decoder training, using only half of the data used for RoBERTa.

The abstract from the paper is the following:

*Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks. In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with disentangled attention) that improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content and position, respectively, and the attention weights among words are computed using disentangled matrices on their contents and relative positions. Second, an enhanced mask decoder is used to replace the output softmax layer to predict the masked tokens for model pre-training. We show that these two techniques significantly improve the efficiency of model pre-training and the performance of downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9% (90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%) and RACE by +3.6% (83.2% vs. 86.8%). The DeBERTa code and pre-trained models will be publicly available at https://github.com/microsoft/DeBERTa.*
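The disentangled attention described in the abstract can be sketched numerically. The following is a minimal toy illustration (not the library's implementation): each token has a content vector, each clipped relative distance has a position embedding, and the raw attention score is the sum of content-to-content, content-to-position, and position-to-content terms, scaled by 1/√(3d) as in the paper. All sizes and weight matrices here are random, hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d = 4, 8  # toy sequence length and hidden size (hypothetical)
k = 3              # maximum relative distance before clipping

# Content vectors H and relative-position embeddings P (one per clipped distance)
H = rng.normal(size=(seq_len, d))
P = rng.normal(size=(2 * k, d))

# Separate ("disentangled") projections for content and position
Wq_c, Wk_c = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wq_r, Wk_r = rng.normal(size=(d, d)), rng.normal(size=(d, d))

def rel_idx(i, j):
    """Clipped relative distance delta(i, j), mapped into [0, 2k)."""
    return int(np.clip(i - j, -k, k - 1) + k)

Qc, Kc = H @ Wq_c, H @ Wk_c  # content queries / keys
Qr, Kr = P @ Wq_r, P @ Wk_r  # position queries / keys

A = np.zeros((seq_len, seq_len))
for i in range(seq_len):
    for j in range(seq_len):
        c2c = Qc[i] @ Kc[j]              # content-to-content
        c2p = Qc[i] @ Kr[rel_idx(i, j)]  # content-to-position
        p2c = Kc[j] @ Qr[rel_idx(j, i)]  # position-to-content
        A[i, j] = (c2c + c2p + p2c) / np.sqrt(3 * d)

# Row-wise softmax turns scores into attention weights
weights = np.exp(A - A.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
```

The key difference from standard self-attention is that position information enters through its own projection matrices rather than being added into the input embeddings.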
The TensorFlow 2.0 implementation of the [DeBERTa](https://huggingface.co/DeBERTa) model was contributed by [kamalkraj](https://huggingface.co/kamalkraj). The original code can be found [here](https://github.com/microsoft/DeBERTa).
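As a quick sanity check, the pretrained base checkpoint can be loaded through the standard Auto classes. This is a minimal sketch assuming the `transformers` and `torch` packages are installed and that the `microsoft/deberta-base` checkpoint is available on the Hub:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the pretrained DeBERTa base encoder and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModel.from_pretrained("microsoft/deberta-base")

inputs = tokenizer("Hello, DeBERTa!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One hidden-state vector per input token: (batch, seq_len, hidden_size)
print(outputs.last_hidden_state.shape)
```

The task-specific classes documented below (`DebertaForSequenceClassification`, `DebertaForQuestionAnswering`, etc.) add the corresponding heads on top of this encoder.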
## Resources[[resources]]

A list of Hugging Face and community (indicated by 🌎) resources to help you get started with DeBERTa. If you would like to submit a resource to be included here, please open a Pull Request and we'll review it! A resource should ideally demonstrate something new rather than duplicating an existing one.

<PipelineTag pipeline="text-classification"/>

- A blog post on how to [accelerate large model training using DeepSpeed](https://huggingface.co/blog/accelerate-deepspeed) with DeBERTa.
- A blog post on [supercharging customer service with machine learning](https://huggingface.co/blog/supercharge-customer-service-with-machine-learning) with DeBERTa.
- [`DebertaForSequenceClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb).
- [`TFDebertaForSequenceClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/text-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb).
- [Text classification task guide](../tasks/sequence_classification)

<PipelineTag pipeline="token-classification" />

- [`DebertaForTokenClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/token-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification.ipynb).
- [`TFDebertaForTokenClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/token-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb).
- [Token classification](https://huggingface.co/course/chapter7/2?fw=pt) chapter of the 🤗 Hugging Face Course.
- [Byte-Pair Encoding tokenization](https://huggingface.co/course/chapter6/5?fw=pt) chapter of the 🤗 Hugging Face Course.
- [Token classification task guide](../tasks/token_classification)

<PipelineTag pipeline="fill-mask"/>

- [`DebertaForMaskedLM`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#robertabertdistilbert-and-masked-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
- [`TFDebertaForMaskedLM`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/language-modeling#run_mlmpy) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb).
- [Masked language modeling](https://huggingface.co/course/chapter7/3?fw=pt) chapter of the 🤗 Hugging Face Course.
- [Masked language modeling task guide](../tasks/masked_language_modeling)

<PipelineTag pipeline="question-answering"/>

- [`DebertaForQuestionAnswering`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb).
- [`TFDebertaForQuestionAnswering`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/question-answering) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb).
- [Question answering](https://huggingface.co/course/chapter7/7?fw=pt) chapter of the 🤗 Hugging Face Course.
- [Question answering task guide](../tasks/question_answering)
## DebertaConfig[[transformers.DebertaConfig]]

[[autodoc]] DebertaConfig

## DebertaTokenizer[[transformers.DebertaTokenizer]]

[[autodoc]] DebertaTokenizer
    - build_inputs_with_special_tokens
    - get_special_tokens_mask
    - create_token_type_ids_from_sequences
    - save_vocabulary

## DebertaTokenizerFast[[transformers.DebertaTokenizerFast]]

[[autodoc]] DebertaTokenizerFast
    - build_inputs_with_special_tokens
    - create_token_type_ids_from_sequences

<frameworkcontent>
<pt>

## DebertaModel[[transformers.DebertaModel]]

[[autodoc]] DebertaModel
    - forward

## DebertaPreTrainedModel[[transformers.DebertaPreTrainedModel]]

[[autodoc]] DebertaPreTrainedModel

## DebertaForMaskedLM[[transformers.DebertaForMaskedLM]]

[[autodoc]] DebertaForMaskedLM
    - forward

## DebertaForSequenceClassification[[transformers.DebertaForSequenceClassification]]

[[autodoc]] DebertaForSequenceClassification
    - forward

## DebertaForTokenClassification[[transformers.DebertaForTokenClassification]]

[[autodoc]] DebertaForTokenClassification
    - forward

## DebertaForQuestionAnswering[[transformers.DebertaForQuestionAnswering]]

[[autodoc]] DebertaForQuestionAnswering
    - forward

</pt>
<tf>

## TFDebertaModel[[transformers.TFDebertaModel]]

[[autodoc]] TFDebertaModel
    - call

## TFDebertaPreTrainedModel[[transformers.TFDebertaPreTrainedModel]]

[[autodoc]] TFDebertaPreTrainedModel
    - call

## TFDebertaForMaskedLM[[transformers.TFDebertaForMaskedLM]]

[[autodoc]] TFDebertaForMaskedLM
    - call

## TFDebertaForSequenceClassification[[transformers.TFDebertaForSequenceClassification]]

[[autodoc]] TFDebertaForSequenceClassification
    - call

## TFDebertaForTokenClassification[[transformers.TFDebertaForTokenClassification]]

[[autodoc]] TFDebertaForTokenClassification
    - call

## TFDebertaForQuestionAnswering[[transformers.TFDebertaForQuestionAnswering]]

[[autodoc]] TFDebertaForQuestionAnswering
    - call

</tf>
</frameworkcontent>