<!--Copyright 2020 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# DeBERTa[[deberta]]
## Overview[[overview]]
The DeBERTa model was proposed in [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen. It is based on Google's BERT model released in 2018 and Facebook's RoBERTa model released in 2019.

DeBERTa improves on RoBERTa with disentangled attention and an enhanced mask decoder, while using only half the data used for RoBERTa.

The abstract from the paper is the following:
*Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks. In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with disentangled attention) that improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content and position, respectively, and the attention weights among words are computed using disentangled matrices on their contents and relative positions. Second, an enhanced mask decoder is used to replace the output softmax layer to predict the masked tokens for model pretraining. We show that these two techniques significantly improve the efficiency of model pre-training and performance of downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9% (90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%) and RACE by +3.6% (83.2% vs. 86.8%). The DeBERTa code and pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.*
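The disentangled attention decomposition described in the abstract can be sketched in plain Python. This is an illustrative toy, not the Transformers implementation: the projections are random, the relative-position indexing is simplified (no bucketing or clipping), and all names (`q_c`, `k_c`, `q_r`, `k_r`) are hypothetical. It only shows how the three score terms (content-to-content, content-to-position, position-to-content) combine before the softmax:

```python
import math
import random

random.seed(0)
L, d = 4, 8          # toy sequence length and head dimension
R = 2 * L - 1        # number of relative-distance buckets in this sketch

def rand_mat(n, m):
    return [[random.gauss(0, 1) for _ in range(m)] for _ in range(n)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def softmax(row):
    m = max(row)
    e = [math.exp(x - m) for x in row]
    s = sum(e)
    return [x / s for x in e]

# Content projections (one row per token) and relative-position
# projections (one row per relative distance).
q_c, k_c = rand_mat(L, d), rand_mat(L, d)
q_r, k_r = rand_mat(R, d), rand_mat(R, d)

scores = []
for i in range(L):
    row = []
    for j in range(L):
        rel = i - j + (L - 1)          # simplified relative-distance index
        s = dot(q_c[i], k_c[j])        # content-to-content
        s += dot(q_c[i], k_r[rel])     # content-to-position
        s += dot(k_c[j], q_r[rel])     # position-to-content
        row.append(s / math.sqrt(3 * d))  # the paper scales by sqrt(3d)
    scores.append(row)

attn = [softmax(row) for row in scores]
```

Unlike standard self-attention, which adds absolute position embeddings to the input once, the relative-position terms here enter every attention score directly, which is the core idea the paper's disentangled matrices implement.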
The TensorFlow 2.0 implementation of the [DeBERTa](https://huggingface.co/DeBERTa) model was contributed by [kamalkraj](https://huggingface.co/kamalkraj). The original code can be found [here](https://github.com/microsoft/DeBERTa).
## Resources[[resources]]
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DeBERTa. If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
<PipelineTag pipeline="text-classification"/>
- A blog post on how to [Accelerate Large Model Training using DeepSpeed](https://huggingface.co/blog/accelerate-deepspeed) with DeBERTa.
- A blog post on [Supercharged Customer Service with Machine Learning](https://huggingface.co/blog/supercharge-customer-service-with-machine-learning) with DeBERTa.
- [`DebertaForSequenceClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb).
- [`TFDebertaForSequenceClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/text-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb).
- [Text classification task guide](../tasks/sequence_classification)
<PipelineTag pipeline="token-classification" />
- [`DebertaForTokenClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/token-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification.ipynb).
- [`TFDebertaForTokenClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/token-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb).
- [Token classification](https://huggingface.co/course/chapter7/2?fw=pt) chapter of the 🤗 Hugging Face Course.
- [Byte-Pair Encoding tokenization](https://huggingface.co/course/chapter6/5?fw=pt) chapter of the 🤗 Hugging Face Course.
- [Token classification task guide](../tasks/token_classification)
<PipelineTag pipeline="fill-mask"/>
- [`DebertaForMaskedLM`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#robertabertdistilbert-and-masked-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb).
- [`TFDebertaForMaskedLM`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/language-modeling#run_mlmpy) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb).
- [Masked language modeling](https://huggingface.co/course/chapter7/3?fw=pt) chapter of the 🤗 Hugging Face Course.
- [Masked language modeling task guide](../tasks/masked_language_modeling)
<PipelineTag pipeline="question-answering"/>
- [`DebertaForQuestionAnswering`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb).
- [`TFDebertaForQuestionAnswering`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/question-answering) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb).
- [Question answering](https://huggingface.co/course/chapter7/7?fw=pt) chapter of the 🤗 Hugging Face Course.
- [Question answering task guide](../tasks/question_answering)
## DebertaConfig[[transformers.DebertaConfig]]
[[autodoc]] DebertaConfig
## DebertaTokenizer[[transformers.DebertaTokenizer]]
[[autodoc]] DebertaTokenizer
- build_inputs_with_special_tokens
- get_special_tokens_mask
- create_token_type_ids_from_sequences
- save_vocabulary
## DebertaTokenizerFast[[transformers.DebertaTokenizerFast]]
[[autodoc]] DebertaTokenizerFast
- build_inputs_with_special_tokens
- create_token_type_ids_from_sequences
<frameworkcontent>
<pt>
## DebertaModel[[transformers.DebertaModel]]
[[autodoc]] DebertaModel
- forward
## DebertaPreTrainedModel[[transformers.DebertaPreTrainedModel]]
[[autodoc]] DebertaPreTrainedModel
## DebertaForMaskedLM[[transformers.DebertaForMaskedLM]]
[[autodoc]] DebertaForMaskedLM
- forward
## DebertaForSequenceClassification[[transformers.DebertaForSequenceClassification]]
[[autodoc]] DebertaForSequenceClassification
- forward
## DebertaForTokenClassification[[transformers.DebertaForTokenClassification]]
[[autodoc]] DebertaForTokenClassification
- forward
## DebertaForQuestionAnswering[[transformers.DebertaForQuestionAnswering]]
[[autodoc]] DebertaForQuestionAnswering
- forward
</pt>
<tf>
## TFDebertaModel[[transformers.TFDebertaModel]]
[[autodoc]] TFDebertaModel
- call
## TFDebertaPreTrainedModel[[transformers.TFDebertaPreTrainedModel]]
[[autodoc]] TFDebertaPreTrainedModel
- call
## TFDebertaForMaskedLM[[transformers.TFDebertaForMaskedLM]]
[[autodoc]] TFDebertaForMaskedLM
- call
## TFDebertaForSequenceClassification[[transformers.TFDebertaForSequenceClassification]]
[[autodoc]] TFDebertaForSequenceClassification
- call
## TFDebertaForTokenClassification[[transformers.TFDebertaForTokenClassification]]
[[autodoc]] TFDebertaForTokenClassification
- call
## TFDebertaForQuestionAnswering[[transformers.TFDebertaForQuestionAnswering]]
[[autodoc]] TFDebertaForQuestionAnswering
- call
</tf>
</frameworkcontent>