<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->
<div style="float: right;">
    <div class="flex flex-wrap space-x-1">
        <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
        <img alt="TensorFlow" src="https://img.shields.io/badge/TensorFlow-FF6F00?style=flat&logo=tensorflow&logoColor=white">
        <img alt="FlashAttention" src="https://img.shields.io/badge/%E2%9A%A1%EF%B8%8E%20FlashAttention-eae0c8?style=flat">
        <img alt="SDPA" src="https://img.shields.io/badge/SDPA-DE3412?style=flat&logo=pytorch&logoColor=white">
    </div>
</div>
# GPT-2[[gpt-2]]

[GPT-2](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) is a scaled-up version of GPT, a causal transformer language model, with 10x more parameters and training data. The model was pretrained on a 40GB dataset to predict the next word in a sequence based on all the previous words. This approach enabled the model to perform many downstream tasks in a zero-shot setting.

The model architecture uses a unidirectional (causal) attention mechanism in which each token can only attend to previous tokens, which makes it particularly effective for text generation tasks.
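The causal masking described above can be made concrete with a small sketch. This is an illustration of the mechanism, not GPT-2's actual implementation: a lower-triangular mask allows each position to attend only to itself and earlier positions.

```python
import torch

# Illustrative causal mask: position i may attend to positions 0..i only.
seq_len = 5
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
print(causal_mask)
```

During attention, scores at the `False` positions are replaced with negative infinity before the softmax, so future tokens receive zero attention weight.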
You can find all the original GPT-2 checkpoints under the [OpenAI community](https://huggingface.co/openai-community?search_models=gpt) organization.

> [!TIP]
> Click on the GPT-2 models in the right sidebar for more examples of how to apply GPT-2 to different language tasks.

The example below demonstrates how to generate text with GPT-2 using [`Pipeline`] or [`AutoModel`], and from the command line.
<hfoptions id="usage">
<hfoption id="Pipeline">

```py
import torch
from transformers import pipeline

# create a pipeline for text generation
pipeline = pipeline(task="text-generation", model="openai-community/gpt2", dtype=torch.float16, device=0)
pipeline("Hello, I'm a language model")
```

</hfoption>
<hfoption id="AutoModel">

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# load the pretrained model and tokenizer
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2", dtype=torch.float16, device_map="auto", attn_implementation="sdpa")
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")

# tokenize the input text and move it to the GPU
input_ids = tokenizer("Hello, I'm a language model", return_tensors="pt").to("cuda")

# generate text
output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

</hfoption>
<hfoption id="transformers CLI">

```bash
echo -e "Hello, I'm a language model" | transformers run --task text-generation --model openai-community/gpt2 --device 0
```

</hfoption>
</hfoptions>
You can also serve the model with vLLM using the `transformers` backend.

```bash
vllm serve openai-community/gpt2 --model-impl transformers
```
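Once the server is running, vLLM exposes an OpenAI-compatible REST API, by default at `http://localhost:8000/v1`. The sketch below builds a completion request for that API; the endpoint path and payload fields follow vLLM's OpenAI-compatible interface, and the actual HTTP call is left commented out because it needs the server started by the command above.

```python
import json

# Request body for vLLM's OpenAI-compatible /v1/completions endpoint.
payload = {
    "model": "openai-community/gpt2",
    "prompt": "Hello, I'm a language model",
    "max_tokens": 50,
}
body = json.dumps(payload)
print(body)

# With the server running, send the request like this:
# import requests
# response = requests.post(
#     "http://localhost:8000/v1/completions",
#     headers={"Content-Type": "application/json"},
#     data=body,
# )
# print(response.json()["choices"][0]["text"])
```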
Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends.

The example below uses [bitsandbytes](../quantization/bitsandbytes) to quantize only the weights to 4 bits.
```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# configure 4-bit quantization
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_use_double_quant=True
)

# load the quantized model
model = AutoModelForCausalLM.from_pretrained(
    "openai-community/gpt2-xl",
    quantization_config=quantization_config,
    device_map="auto"
)

# load the tokenizer and generate text
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2-xl")
inputs = tokenizer("Once upon a time, there was a magical forest", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Notes[[notes]]

- Pad inputs on the right because GPT-2 uses absolute position embeddings.
- GPT-2 can reuse previously computed key-value attention pairs. Access this feature with the [past_key_values](https://huggingface.co/docs/transformers/en/model_doc/gpt2#transformers.GPT2Model.forward.past_key_values) parameter in [`GPT2Model.forward`].
- Enable the [scale_attn_by_inverse_layer_idx](https://huggingface.co/docs/transformers/en/model_doc/gpt2#transformers.GPT2Config.scale_attn_by_inverse_layer_idx) and [reorder_and_upcast_attn](https://huggingface.co/docs/transformers/en/model_doc/gpt2#transformers.GPT2Config.reorder_and_upcast_attn) parameters to apply the training stability improvements from [Mistral](./mistral).
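The key-value cache note above can be sketched with a tiny, randomly initialized GPT-2 so nothing is downloaded (the config sizes below are arbitrary, chosen only for illustration): the first forward pass returns `past_key_values`, and feeding it back means the second pass only processes the single new token instead of the whole sequence.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# tiny randomly initialized GPT-2; the sizes are arbitrary for this sketch
config = GPT2Config(vocab_size=100, n_positions=64, n_embd=32, n_layer=2, n_head=2)
model = GPT2LMHeadModel(config)
model.eval()

input_ids = torch.randint(0, config.vocab_size, (1, 8))
with torch.no_grad():
    # first pass: process the full prompt and keep the key-value cache
    out = model(input_ids, use_cache=True)
    next_token = out.logits[:, -1:].argmax(dim=-1)
    # second pass: only the new token is fed in; the cached keys/values
    # stand in for the 8 prompt positions
    out2 = model(next_token, past_key_values=out.past_key_values, use_cache=True)

print(out2.logits.shape)  # torch.Size([1, 1, 100])
```

[`~GenerationMixin.generate`] manages this cache automatically; the manual loop above only shows what the `past_key_values` parameter carries between calls.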
## GPT2Config

[[autodoc]] GPT2Config

## GPT2Tokenizer

[[autodoc]] GPT2Tokenizer
    - save_vocabulary

## GPT2TokenizerFast

[[autodoc]] GPT2TokenizerFast

## GPT2 specific outputs[[gpt2-specific-outputs]]

[[autodoc]] models.gpt2.modeling_gpt2.GPT2DoubleHeadsModelOutput

## GPT2Model

[[autodoc]] GPT2Model
    - forward

## GPT2LMHeadModel

[[autodoc]] GPT2LMHeadModel
    - forward

## GPT2DoubleHeadsModel

[[autodoc]] GPT2DoubleHeadsModel
    - forward

## GPT2ForQuestionAnswering

[[autodoc]] GPT2ForQuestionAnswering
    - forward

## GPT2ForSequenceClassification

[[autodoc]] GPT2ForSequenceClassification
    - forward

## GPT2ForTokenClassification

[[autodoc]] GPT2ForTokenClassification
    - forward