
ConvBERT [[convbert]]

Overview [[overview]]

The ConvBERT model was proposed in ConvBERT: Improving BERT with Span-based Dynamic Convolution by Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, and Shuicheng Yan.

The abstract from the paper is the following:

Pre-trained language models like BERT and its variants have recently achieved impressive performance in various natural language understanding tasks. However, BERT heavily relies on the global self-attention block and thus suffers large memory footprint and computation cost. Although all its attention heads query on the whole input sequence for generating the attention map from a global perspective, we observe some heads only need to learn local dependencies, which means the existence of computation redundancy. We therefore propose a novel span-based dynamic convolution to replace these self-attention heads to directly model local dependencies. The novel convolution heads, together with the rest self-attention heads, form a new mixed attention block that is more efficient at both global and local context learning. We equip BERT with this mixed attention design and build a ConvBERT model. Experiments have shown that ConvBERT significantly outperforms BERT and its variants in various downstream tasks, with lower training cost and fewer model parameters. Remarkably, ConvBERTbase model achieves 86.4 GLUE score, 0.7 higher than ELECTRAbase, while using less than 1/4 training cost. Code and pre-trained models will be released.

์ด ๋ชจ๋ธ์€ abhishek์— ์˜ํ•ด ๊ธฐ์—ฌ๋˜์—ˆ์œผ๋ฉฐ, ์›๋ณธ ๊ตฌํ˜„์€ ์—ฌ๊ธฐ์—์„œ ์ฐพ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค : https://github.com/yitu-opensource/ConvBert

์‚ฌ์šฉ ํŒ [[usage-tips]]

ConvBERT ํ›ˆ๋ จ ํŒ์€ BERT์™€ ์œ ์‚ฌํ•ฉ๋‹ˆ๋‹ค. ์‚ฌ์šฉ ํŒ์€ BERT ๋ฌธ์„œ.๋ฅผ ์ฐธ๊ณ ํ•˜์‹ญ์‹œ์˜ค.
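Since ConvBERT is used the same way as BERT, a forward pass looks like a BERT forward pass. The sketch below builds a tiny, randomly initialized `ConvBertModel` from a `ConvBertConfig`, so no checkpoint is downloaded. The sizes are arbitrary values chosen for illustration, not those of any released model, and the example assumes PyTorch and `transformers` are installed.

```python
import torch
from transformers import ConvBertConfig, ConvBertModel

# Tiny illustrative configuration (assumed sizes, not a released checkpoint).
config = ConvBertConfig(
    vocab_size=100,
    hidden_size=64,
    embedding_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
)
model = ConvBertModel(config)  # weights are randomly initialized
model.eval()

# Dummy batch of 2 sequences with 10 token ids each, passed exactly as for BERT.
input_ids = torch.randint(0, config.vocab_size, (2, 10))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)

# One hidden vector per token: (batch_size, sequence_length, hidden_size).
hidden_shape = tuple(outputs.last_hidden_state.shape)
print(hidden_shape)
```

For real use, a pretrained checkpoint would typically be loaded with `from_pretrained` together with the matching tokenizer instead of a random configuration.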

Resources [[resources]]

ConvBertConfig [[transformers.ConvBertConfig]]

[[autodoc]] ConvBertConfig

ConvBertTokenizer [[transformers.ConvBertTokenizer]]

[[autodoc]] ConvBertTokenizer - build_inputs_with_special_tokens - get_special_tokens_mask - create_token_type_ids_from_sequences - save_vocabulary

ConvBertTokenizerFast [[transformers.ConvBertTokenizerFast]]

[[autodoc]] ConvBertTokenizerFast

ConvBertModel [[transformers.ConvBertModel]]

[[autodoc]] ConvBertModel - forward

ConvBertForMaskedLM [[transformers.ConvBertForMaskedLM]]

[[autodoc]] ConvBertForMaskedLM - forward

ConvBertForSequenceClassification [[transformers.ConvBertForSequenceClassification]]

[[autodoc]] ConvBertForSequenceClassification - forward

ConvBertForMultipleChoice [[transformers.ConvBertForMultipleChoice]]

[[autodoc]] ConvBertForMultipleChoice - forward

ConvBertForTokenClassification [[transformers.ConvBertForTokenClassification]]

[[autodoc]] ConvBertForTokenClassification - forward

ConvBertForQuestionAnswering [[transformers.ConvBertForQuestionAnswering]]

[[autodoc]] ConvBertForQuestionAnswering - forward

TFConvBertModel [[transformers.TFConvBertModel]]

[[autodoc]] TFConvBertModel - call

TFConvBertForMaskedLM [[transformers.TFConvBertForMaskedLM]]

[[autodoc]] TFConvBertForMaskedLM - call

TFConvBertForSequenceClassification [[transformers.TFConvBertForSequenceClassification]]

[[autodoc]] TFConvBertForSequenceClassification - call

TFConvBertForMultipleChoice [[transformers.TFConvBertForMultipleChoice]]

[[autodoc]] TFConvBertForMultipleChoice - call

TFConvBertForTokenClassification [[transformers.TFConvBertForTokenClassification]]

[[autodoc]] TFConvBertForTokenClassification - call

TFConvBertForQuestionAnswering [[transformers.TFConvBertForQuestionAnswering]]

[[autodoc]] TFConvBertForQuestionAnswering - call