BART[[bart]]

Overview[[overview]]

The Bart model was proposed in BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 October 2019.

According to the abstract,

  • Bart uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT).
  • The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme, in which spans of text are replaced with a single mask token.
  • BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE.

This model was contributed by sshleifer. The authors' code can be found here.

Usage tips[[usage-tips]]

  • BART is a model with absolute position embeddings, so it is usually advised to pad the inputs on the right rather than the left.
  • It is a seq2seq model with an encoder and a decoder. The encoder is fed a corrupted version of the tokens, and the decoder is fed the original tokens (but has a mask to hide future words, like a regular transformer decoder). During pretraining, the encoder input is corrupted with a composition of the following transformations:

  • mask random tokens (as in BERT)
  • delete random tokens
  • mask a span of k tokens with a single mask token (a span of 0 tokens is an insertion of a mask token)
  • permute sentences
  • rotate the document so that it starts at a specific token
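
The five transformations listed above can be illustrated on plain token lists. The following is only a minimal sketch, not the authors' pretraining code: the function names, the `MASK` string, and the choice to pass span position and length as explicit arguments (rather than sampling them) are all assumptions made for illustration.

```python
import random

MASK = "<mask>"  # placeholder mask symbol used only in this sketch


def token_masking(tokens, p, rng):
    """Replace each token with MASK with probability p (BERT-style)."""
    return [MASK if rng.random() < p else t for t in tokens]


def token_deletion(tokens, p, rng):
    """Drop each token with probability p; deleted positions are not marked."""
    return [t for t in tokens if rng.random() >= p]


def text_infilling(tokens, start, k):
    """Replace the k tokens beginning at `start` with a single MASK.

    With k == 0 nothing is removed, so a MASK is simply inserted.
    """
    return tokens[:start] + [MASK] + tokens[start + k:]


def sentence_permutation(sentences, rng):
    """Return the sentences in a random order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def document_rotation(tokens, start):
    """Rotate the document so that it begins at tokens[start]."""
    return tokens[start:] + tokens[:start]


tokens = ["A", "B", "C", "D", "E"]
print(text_infilling(tokens, 1, 2))   # ['A', '<mask>', 'D', 'E']
print(text_infilling(tokens, 2, 0))   # ['A', 'B', '<mask>', 'C', 'D', 'E']
print(document_rotation(tokens, 2))   # ['C', 'D', 'E', 'A', 'B']
```

Note that text infilling hides how many tokens are missing, whereas token masking keeps one mask per hidden token, so the sequence length is unchanged.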

Implementation Notes[[implementation-notes]]

  • Bart does not use token_type_ids for sequence classification. Use [BartTokenizer] or [~BartTokenizer.encode] to get the proper splitting.
  • The forward pass of [BartModel] creates the decoder_input_ids automatically if they are not passed. This differs from some other modeling APIs. A typical use case of this feature is mask filling.
  • Model predictions are intended to be identical to the original implementation when forced_bos_token_id=0. This only works, however, if the string you pass to [fairseq.encode] starts with a space.
  • [~generation.GenerationMixin.generate] should be used for conditional generation tasks such as summarization. See the examples in its documentation for details.
  • Models that load the facebook/bart-large-cnn weights will not have a mask_token_id and cannot perform mask-filling tasks.
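
As a rough illustration of the second note, decoder inputs can be derived from the encoder inputs by shifting every sequence one position to the right and prepending a decoder start token. The sketch below uses plain Python lists; the function name `shift_right_sketch` and the token ids (0 for `<s>`, 1 for `<pad>`, 2 for `</s>`, with 2 also acting as the decoder start token) are assumptions for illustration, and the library itself operates on tensors.

```python
def shift_right_sketch(input_ids, decoder_start_token_id):
    """Build decoder inputs by shifting each row one position to the right.

    The last token of every row is dropped and the decoder start token
    is placed at position 0, so all rows keep their original length.
    """
    return [[decoder_start_token_id] + list(row[:-1]) for row in input_ids]


# Hypothetical ids: 0 = <s>, 1 = <pad>, 2 = </s>.
# The second row is padded on the right, as recommended for BART.
batch = [
    [0, 8, 9, 2],
    [0, 7, 2, 1],
]
print(shift_right_sketch(batch, decoder_start_token_id=2))
# [[2, 0, 8, 9], [2, 0, 7, 2]]
```

Because the shift happens inside the forward pass, a call that supplies only input_ids is enough for mask filling.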

Mask Filling[[mask-filling]]

The facebook/bart-base and facebook/bart-large checkpoints can be used to fill multi-token masks.

from transformers import BartForConditionalGeneration, BartTokenizer

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large", forced_bos_token_id=0)
tok = BartTokenizer.from_pretrained("facebook/bart-large")
example_english_phrase = "UN Chief Says There Is No <mask> in Syria"
batch = tok(example_english_phrase, return_tensors="pt")
generated_ids = model.generate(batch["input_ids"])
assert tok.batch_decode(generated_ids, skip_special_tokens=True) == [
    "UN Chief Says There Is No Plan to Stop Chemical Weapons in Syria"
]

Resources[[resources]]

A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with BART. If you would like to submit a resource to be included here, please open a Pull Request and we will review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.

See also:

BartConfig[[transformers.BartConfig]]

[[autodoc]] BartConfig - all

BartTokenizer[[transformers.BartTokenizer]]

[[autodoc]] BartTokenizer - all

BartTokenizerFast[[transformers.BartTokenizerFast]]

[[autodoc]] BartTokenizerFast - all

BartModel[[transformers.BartModel]]

[[autodoc]] BartModel - forward

BartForConditionalGeneration[[transformers.BartForConditionalGeneration]]

[[autodoc]] BartForConditionalGeneration - forward

BartForSequenceClassification[[transformers.BartForSequenceClassification]]

[[autodoc]] BartForSequenceClassification - forward

BartForQuestionAnswering[[transformers.BartForQuestionAnswering]]

[[autodoc]] BartForQuestionAnswering - forward

BartForCausalLM[[transformers.BartForCausalLM]]

[[autodoc]] BartForCausalLM - forward

TFBartModel[[transformers.TFBartModel]]

[[autodoc]] TFBartModel - call

TFBartForConditionalGeneration[[transformers.TFBartForConditionalGeneration]]

[[autodoc]] TFBartForConditionalGeneration - call

TFBartForSequenceClassification[[transformers.TFBartForSequenceClassification]]

[[autodoc]] TFBartForSequenceClassification - call

FlaxBartModel[[transformers.FlaxBartModel]]

[[autodoc]] FlaxBartModel - call - encode - decode

FlaxBartForConditionalGeneration[[transformers.FlaxBartForConditionalGeneration]]

[[autodoc]] FlaxBartForConditionalGeneration - call - encode - decode

FlaxBartForSequenceClassification[[transformers.FlaxBartForSequenceClassification]]

[[autodoc]] FlaxBartForSequenceClassification - call - encode - decode

FlaxBartForQuestionAnswering[[transformers.FlaxBartForQuestionAnswering]]

[[autodoc]] FlaxBartForQuestionAnswering - call - encode - decode

FlaxBartForCausalLM[[transformers.FlaxBartForCausalLM]]

[[autodoc]] FlaxBartForCausalLM - call