
OpenAI GPT [[openai-gpt]]

Overview [[overview]]

The OpenAI GPT model was proposed in Improving Language Understanding by Generative Pre-Training by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever. It is a causal (unidirectional) transformer pre-trained with a language modeling objective on a large corpus with long-range dependencies, the Toronto Book Corpus.

The abstract from the paper is the following:

Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied.

Write With Transformer is a web application created by Hugging Face that showcases the generative capabilities of several models, GPT among them.

This model was contributed by thomwolf. The original code can be found here.

Usage tips [[usage-tips]]

  • GPT is a model with absolute position embeddings, so it is usually advised to pad the inputs on the right rather than on the left.
  • GPT was trained with a causal language modeling (CLM) objective and is therefore powerful at predicting the next token in a sequence. Leveraging this feature allows GPT-2 to generate syntactically coherent text, as can be observed in the run_generation.py example script.
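
The two tips above can be illustrated without loading any model. The following is a minimal sketch in plain Python, not the library's implementation: `PAD_ID`, `pad_batch`, `real_token_positions`, and `next_token_pairs` are hypothetical names introduced only for this example, and it assumes that position indices are simply the column indices 0, 1, 2, ... of the padded batch. Under that assumption, right padding keeps every real token at the position it would have in an unpadded sequence, while left padding shifts it. The last helper shows the (context, next token) pairs that a causal language modeling objective trains on.

```python
from typing import List, Tuple

PAD_ID = 0  # hypothetical padding id, chosen only for this illustration


def pad_batch(
    sequences: List[List[int]], side: str = "right"
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence to the longest length; return (input_ids, attention_mask)."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        padding = [PAD_ID] * (max_len - len(seq))
        real = [1] * len(seq)          # 1 marks a real token
        ignored = [0] * len(padding)   # 0 marks a padding slot
        if side == "right":
            input_ids.append(seq + padding)
            attention_mask.append(real + ignored)
        else:
            input_ids.append(padding + seq)
            attention_mask.append(ignored + real)
    return input_ids, attention_mask


def real_token_positions(mask_row: List[int]) -> List[int]:
    """Column indices (assumed to be the absolute positions) of the real tokens."""
    return [i for i, m in enumerate(mask_row) if m == 1]


def next_token_pairs(token_ids: List[int]) -> List[Tuple[List[int], int]]:
    """(context, target) pairs: each token is predicted from the tokens before it."""
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]


batch = [[11, 12, 13, 14], [21, 22]]
right_ids, right_mask = pad_batch(batch, side="right")
left_ids, left_mask = pad_batch(batch, side="left")

print(real_token_positions(right_mask[1]))  # [0, 1]: same as without padding
print(real_token_positions(left_mask[1]))   # [2, 3]: shifted by the padding
print(next_token_pairs([11, 12, 13]))       # [([11], 12), ([11, 12], 13)]
```

With a real tokenizer, the padding side is normally a tokenizer setting rather than something written by hand; the sketch only shows why the choice matters for a model with absolute positions.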

Note:

If you want to reproduce the original tokenization process of the OpenAI GPT paper, you will need to install ftfy and SpaCy:

pip install spacy ftfy==4.4.3
python -m spacy download en

If you don't install ftfy and SpaCy, the [OpenAIGPTTokenizer] will default to tokenizing with BERT's BasicTokenizer followed by Byte-Pair Encoding (which should be fine for most usage, so don't worry).
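
To see which of the two paths described above an environment is likely to take, one can check whether both packages are importable. This is a minimal sketch using only the Python standard library; it does not inspect the tokenizer itself, and the function names `tokenization_dependencies` and `expected_pre_tokenizer` are hypothetical helpers introduced for this example. The returned labels simply restate the behavior described in the note.

```python
import importlib.util
from typing import Dict


def tokenization_dependencies() -> Dict[str, bool]:
    """Report whether ftfy and spacy can be imported in this environment."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in ("ftfy", "spacy")
    }


def expected_pre_tokenizer(deps: Dict[str, bool]) -> str:
    """Map the availability report to the path described in the note above."""
    if all(deps.values()):
        return "ftfy + SpaCy"
    return "BasicTokenizer fallback"


status = tokenization_dependencies()
print(status)
print(expected_pre_tokenizer(status))
```

The check only looks at importability; it says nothing about installed versions, so it cannot confirm that the pinned ftfy release from the install command is the one present.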

Resources [[resources]]

A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with OpenAI GPT. If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.

OpenAIGPTConfig [[transformers.OpenAIGPTConfig]]

[[autodoc]] OpenAIGPTConfig

OpenAIGPTTokenizer [[transformers.OpenAIGPTTokenizer]]

[[autodoc]] OpenAIGPTTokenizer - save_vocabulary

OpenAIGPTTokenizerFast [[transformers.OpenAIGPTTokenizerFast]]

[[autodoc]] OpenAIGPTTokenizerFast

OpenAI specific outputs [[transformers.models.openai.modeling_openai.OpenAIGPTDoubleHeadsModelOutput]]

[[autodoc]] models.openai.modeling_openai.OpenAIGPTDoubleHeadsModelOutput

[[autodoc]] models.openai.modeling_tf_openai.TFOpenAIGPTDoubleHeadsModelOutput

OpenAIGPTModel [[transformers.OpenAIGPTModel]]

[[autodoc]] OpenAIGPTModel - forward

OpenAIGPTLMHeadModel [[transformers.OpenAIGPTLMHeadModel]]

[[autodoc]] OpenAIGPTLMHeadModel - forward

OpenAIGPTDoubleHeadsModel [[transformers.OpenAIGPTDoubleHeadsModel]]

[[autodoc]] OpenAIGPTDoubleHeadsModel - forward

OpenAIGPTForSequenceClassification [[transformers.OpenAIGPTForSequenceClassification]]

[[autodoc]] OpenAIGPTForSequenceClassification - forward

TFOpenAIGPTModel [[transformers.TFOpenAIGPTModel]]

[[autodoc]] TFOpenAIGPTModel - call

TFOpenAIGPTLMHeadModel [[transformers.TFOpenAIGPTLMHeadModel]]

[[autodoc]] TFOpenAIGPTLMHeadModel - call

TFOpenAIGPTDoubleHeadsModel [[transformers.TFOpenAIGPTDoubleHeadsModel]]

[[autodoc]] TFOpenAIGPTDoubleHeadsModel - call

TFOpenAIGPTForSequenceClassification [[transformers.TFOpenAIGPTForSequenceClassification]]

[[autodoc]] TFOpenAIGPTForSequenceClassification - call