
DeBERTa[[deberta]]

Overview[[overview]]

The DeBERTa model was proposed in the paper DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen. It is based on Google's BERT model, released in 2018, and Facebook's RoBERTa model, released in 2019. DeBERTa improves on RoBERTa through disentangled attention and enhanced mask decoder training, while using only half of the data used for RoBERTa.

The abstract from the paper is the following:

Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks. In this paper we propose a new model architecture, DeBERTa, that improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content and position, respectively, and the attention weights among words are computed using disentangled matrices on their contents and relative positions. Second, an enhanced mask decoder is used to replace the output softmax layer to predict the masked tokens for model pre-training. We show that these two techniques significantly improve the efficiency of model pre-training and the performance of downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9% (90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%), and on RACE by +3.6% (83.2% vs. 86.8%). The DeBERTa code and pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.
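The disentangled attention described in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the function names (`relative_bucket`, `disentangled_attention`) and the toy inputs are hypothetical, and the content and position vectors are assumed to be already projected into query and key space. The clipping of relative distances to a window of size `k` and the `1 / sqrt(3d)` scaling follow the formulation given in the DeBERTa paper.

```python
import math
import random


def relative_bucket(i, j, k):
    """Map the relative distance i - j to a bucket index in [0, 2k)."""
    d = i - j
    if d <= -k:
        return 0
    if d >= k:
        return 2 * k - 1
    return d + k


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def disentangled_attention(q_c, k_c, q_r, k_r, k):
    """Attention weights from separate content and relative-position vectors.

    q_c, k_c: per-token content queries/keys (n vectors of size d).
    q_r, k_r: per-bucket relative-position queries/keys (2k vectors of size d).
    """
    n, d = len(q_c), len(q_c[0])
    scale = 1.0 / math.sqrt(3 * d)  # three score terms are summed
    weights = []
    for i in range(n):
        row = []
        for j in range(n):
            c2c = dot(q_c[i], k_c[j])                         # content-to-content
            c2p = dot(q_c[i], k_r[relative_bucket(i, j, k)])  # content-to-position
            p2c = dot(k_c[j], q_r[relative_bucket(j, i, k)])  # position-to-content
            row.append((c2c + c2p + p2c) * scale)
        weights.append(softmax(row))
    return weights


# Toy usage with random vectors (hypothetical sizes: 4 tokens, dim 3, window 2).
random.seed(0)
n, d, k = 4, 3, 2
rand_vec = lambda: [random.uniform(-1, 1) for _ in range(d)]
q_c = [rand_vec() for _ in range(n)]
k_c = [rand_vec() for _ in range(n)]
q_r = [rand_vec() for _ in range(2 * k)]
k_r = [rand_vec() for _ in range(2 * k)]
weights = disentangled_attention(q_c, k_c, q_r, k_r, k)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` is a probability distribution over the tokens being attended to. Because the position terms depend only on the clipped relative distance, the same `2k` position vectors are shared across all token pairs.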

The TensorFlow 2.0 implementation of the DeBERTa model was contributed by kamalkraj. The original code can be found here.

Resources[[resources]]

A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DeBERTa. If you would like to submit a resource to be included here, please open a Pull Request (PR) and we will review it! A resource should demonstrate something new instead of duplicating an existing one.

DebertaConfig[[transformers.DebertaConfig]]

[[autodoc]] DebertaConfig

DebertaTokenizer[[transformers.DebertaTokenizer]]

[[autodoc]] DebertaTokenizer - build_inputs_with_special_tokens - get_special_tokens_mask - create_token_type_ids_from_sequences - save_vocabulary

DebertaTokenizerFast[[transformers.DebertaTokenizerFast]]

[[autodoc]] DebertaTokenizerFast - build_inputs_with_special_tokens - create_token_type_ids_from_sequences

DebertaModel[[transformers.DebertaModel]]

[[autodoc]] DebertaModel - forward

DebertaPreTrainedModel[[transformers.DebertaPreTrainedModel]]

[[autodoc]] DebertaPreTrainedModel

DebertaForMaskedLM[[transformers.DebertaForMaskedLM]]

[[autodoc]] DebertaForMaskedLM - forward

DebertaForSequenceClassification[[transformers.DebertaForSequenceClassification]]

[[autodoc]] DebertaForSequenceClassification - forward

DebertaForTokenClassification[[transformers.DebertaForTokenClassification]]

[[autodoc]] DebertaForTokenClassification - forward

DebertaForQuestionAnswering[[transformers.DebertaForQuestionAnswering]]

[[autodoc]] DebertaForQuestionAnswering - forward

TFDebertaModel[[transformers.TFDebertaModel]]

[[autodoc]] TFDebertaModel - call

TFDebertaPreTrainedModel[[transformers.TFDebertaPreTrainedModel]]

[[autodoc]] TFDebertaPreTrainedModel - call

TFDebertaForMaskedLM[[transformers.TFDebertaForMaskedLM]]

[[autodoc]] TFDebertaForMaskedLM - call

TFDebertaForSequenceClassification[[transformers.TFDebertaForSequenceClassification]]

[[autodoc]] TFDebertaForSequenceClassification - call

TFDebertaForTokenClassification[[transformers.TFDebertaForTokenClassification]]

[[autodoc]] TFDebertaForTokenClassification - call

TFDebertaForQuestionAnswering[[transformers.TFDebertaForQuestionAnswering]]

[[autodoc]] TFDebertaForQuestionAnswering - call