| <!--Copyright 2020 The HuggingFace Team. All rights reserved. | |
| Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | |
| the License. You may obtain a copy of the License at | |
| http://www.apache.org/licenses/LICENSE-2.0 | |
| Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | |
| an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | |
| specific language governing permissions and limitations under the License. | |
| ⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be | |
| rendered properly in your Markdown viewer. | |
| --> | |
| <div style="float: right;"> | |
| <div class="flex flex-wrap space-x-1"> | |
| <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white" > | |
| <img alt= "TensorFlow" src= "https://img.shields.io/badge/TensorFlow-FF6F00?style=flat&logo=tensorflow&logoColor=white" > | |
| <img alt= "Flax" src="https://img.shields.io/badge/Flax-29a79b.svg?style…Nu+W0m6K/I9gGPd/dfx/EN/wN62AhsBWuAAAAAElFTkSuQmCC"> | |
| <img alt="SDPA" src= "https://img.shields.io/badge/SDPA-DE3412?style=flat&logo=pytorch&logoColor=white" > | |
| </div> | |
| </div> | |
| # ALBERT[[albert]] | |
| [ALBERT](https://huggingface.co/papers/1909.11942) is designed to address the scalability and training-memory limitations of [BERT](./bert). It introduces two parameter-reduction techniques. The first, factorized embedding parametrization, splits the large vocabulary embedding matrix into two smaller matrices so that increasing the hidden size does not add many more parameters. The second, cross-layer parameter sharing, lets multiple layers share parameters, which reduces the number of parameters to learn. | |
| ALBERT was created to address the GPU/TPU memory limits, long training times, and sudden performance degradation encountered with BERT. It uses two parameter-reduction techniques to lower memory consumption and increase training speed relative to BERT: | |
| - **Factorized embedding parametrization:** The large vocabulary embedding matrix is decomposed into two smaller matrices, reducing memory consumption. | |
| - **Cross-layer parameter sharing:** Instead of learning separate parameters for each transformer layer, multiple layers share parameters, further reducing the number of weights to learn. | |
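To give a rough sense of what cross-layer parameter sharing saves, the sketch below counts the weights of one standard transformer encoder layer and compares 12 independent layers with 12 layers that reuse a single set of weights. The sizes (`HIDDEN`, `INTERMEDIATE`, `NUM_LAYERS`) are assumed BERT-base-like values chosen for illustration, and the helper `encoder_layer_params` is hypothetical; it is not part of any library, and real checkpoints may differ in detail.

```py
# Illustrative sizes (assumptions): a BERT-base-like encoder.
HIDDEN = 768         # hidden size H
INTERMEDIATE = 3072  # feed-forward inner size
NUM_LAYERS = 12


def encoder_layer_params(hidden: int, intermediate: int) -> int:
    """Count weights and biases in one standard transformer encoder layer."""
    attention = 4 * (hidden * hidden + hidden)  # Q, K, V and output projections
    feed_forward = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden)
    layer_norms = 2 * (2 * hidden)  # two LayerNorms, each with a scale and a bias
    return attention + feed_forward + layer_norms


per_layer = encoder_layer_params(HIDDEN, INTERMEDIATE)
unshared_total = NUM_LAYERS * per_layer  # every layer owns its weights
shared_total = per_layer                 # one set of weights reused by all layers

print(f"one layer:           {per_layer:,}")
print(f"12 layers, unshared: {unshared_total:,}")
print(f"12 layers, shared:   {shared_total:,}")
```

Sharing changes how many weights are stored and learned, not how many layers an input passes through, so the forward pass still applies the layer `NUM_LAYERS` times.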
| Like BERT, ALBERT uses absolute position embeddings, so inputs should be padded on the right. The embedding size is 128, smaller than BERT's 768. ALBERT can process up to 512 tokens at a time. | |
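As a minimal sketch of what right padding means, the snippet below pads a batch of token-id lists to a common length and builds the matching attention mask. The helper `pad_right`, the pad id `0`, and the token ids are all made up for illustration; in practice the tokenizer handles padding for you.

```py
PAD_ID = 0  # hypothetical pad token id, for illustration only


def pad_right(sequences, pad_id=PAD_ID):
    """Pad token-id lists on the right to a common length.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens and 0 for padding.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return input_ids, attention_mask


batch = [[2, 13, 48, 3], [2, 21, 3]]  # made-up token ids
input_ids, attention_mask = pad_right(batch)
print(input_ids)
print(attention_mask)
```

With padding on the right, the real tokens of every sequence keep positions `0..n-1`, so each one is paired with the same absolute position embedding it would get without padding.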
| All official ALBERT checkpoints are available under the [ALBERT community](https://huggingface.co/albert) organization. | |
| > [!TIP] | |
| > Click on the ALBERT models in the right sidebar for more examples of how to apply ALBERT to different language tasks. | |
| The examples below show how to predict the `[MASK]` token with [`Pipeline`], [`AutoModel`], and from the command line. | |
| <hfoptions id="usage"> | |
| <hfoption id="Pipeline"> | |
| ```py | |
| import torch | |
| from transformers import pipeline | |
| pipeline = pipeline( | |
|     task="fill-mask", | |
|     model="albert-base-v2", | |
|     dtype=torch.float16, | |
|     device=0 | |
| ) | |
| pipeline("Plants create [MASK] through a process known as photosynthesis.", top_k=5) | |
| ``` | |
| </hfoption> | |
| <hfoption id="AutoModel"> | |
| ```py | |
| import torch | |
| from transformers import AutoModelForMaskedLM, AutoTokenizer | |
| tokenizer = AutoTokenizer.from_pretrained("albert/albert-base-v2") | |
| model = AutoModelForMaskedLM.from_pretrained( | |
|     "albert/albert-base-v2", | |
|     dtype=torch.float16, | |
|     attn_implementation="sdpa", | |
|     device_map="auto" | |
| ) | |
| prompt = "Plants create energy through a process known as [MASK]." | |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) | |
| with torch.no_grad(): | |
|     outputs = model(**inputs) | |
| # Locate the [MASK] position and take its five highest-scoring tokens | |
| mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1] | |
| predictions = outputs.logits[0, mask_token_index] | |
| top_k = torch.topk(predictions, k=5).indices.tolist() | |
| for token_id in top_k[0]: | |
|     print(f"Prediction: {tokenizer.decode([token_id])}") | |
| ``` | |
| </hfoption> | |
| <hfoption id="transformers CLI"> | |
| ```bash | |
| echo -e "Plants create [MASK] through a process known as photosynthesis." | transformers run --task fill-mask --model albert-base-v2 --device 0 | |
| ``` | |
| </hfoption> | |
| </hfoptions> | |
| ## Notes[[notes]] | |
| - ALBERT uses absolute position embeddings, so inputs should be padded on the right. | |
| - The embedding size `E` is different from the hidden size `H`. Embeddings are context independent (one embedding vector per token), while hidden states are context dependent (one hidden state per sequence of tokens). Because the embedding matrix is `V x E` (where `V` is the vocabulary size), it is generally more logical to have `H >> E`. When `E < H`, the model has fewer parameters. | |
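The arithmetic behind the `E < H` note can be sketched as follows. `H = 768` and `E = 128` are the sizes quoted earlier on this page; `V = 30_000` is an assumed vocabulary size used only for illustration, and the bias of the `E x H` projection is ignored.

```py
V = 30_000  # vocabulary size (assumed for illustration)
H = 768     # hidden size
E = 128     # embedding size

unfactorized = V * H        # a single V x H embedding matrix
factorized = V * E + E * H  # a V x E matrix followed by an E x H projection

print(f"V x H:         {unfactorized:,}")
print(f"V x E + E x H: {factorized:,}")
print(f"reduction:     {unfactorized / factorized:.1f}x")
```

Because `V` is much larger than `H`, the `V x E` term dominates, so shrinking `E` cuts the embedding parameters almost proportionally while `H` can still grow.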
| ## Resources[[resources]] | |
| The resources in the sections below are official Hugging Face and community (indicated by 🌎) resources to help you get started with ALBERT. If you have a resource to add here, please open a Pull Request! Ideally, a resource should show something new rather than duplicate an existing one. | |
| <PipelineTag pipeline="text-classification"/> | |
| - [`AlbertForSequenceClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-classification). | |
| - [`TFAlbertForSequenceClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/text-classification). | |
| - [`FlaxAlbertForSequenceClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/flax/text-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification_flax.ipynb). | |
| - Check the [Text classification task guide](../tasks/sequence_classification) on how to use the model. | |
| <PipelineTag pipeline="token-classification"/> | |
| - [`AlbertForTokenClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/token-classification). | |
| - [`TFAlbertForTokenClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/token-classification) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/token_classification-tf.ipynb). | |
| - [`FlaxAlbertForTokenClassification`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/flax/token-classification). | |
| - [Token classification](https://huggingface.co/course/chapter7/2?fw=pt) chapter of the 🤗 Hugging Face Course. | |
| - Check the [Token classification task guide](../tasks/token_classification) on how to use the model. | |
| <PipelineTag pipeline="fill-mask"/> | |
| - [`AlbertForMaskedLM`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling#robertabertdistilbert-and-masked-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling.ipynb). | |
| - [`TFAlbertForMaskedLM`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/language-modeling#run_mlmpy) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/language_modeling-tf.ipynb). | |
| - [`FlaxAlbertForMaskedLM`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/flax/language-modeling#masked-language-modeling) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/masked_language_modeling_flax.ipynb). | |
| - [Masked language modeling](https://huggingface.co/course/chapter7/3?fw=pt) chapter of the 🤗 Hugging Face Course. | |
| - Check the [Masked language modeling task guide](../tasks/masked_language_modeling) on how to use the model. | |
| <PipelineTag pipeline="question-answering"/> | |
| - [`AlbertForQuestionAnswering`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb). | |
| - [`TFAlbertForQuestionAnswering`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/question-answering) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb). | |
| - [`FlaxAlbertForQuestionAnswering`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/flax/question-answering). | |
| - [Question answering](https://huggingface.co/course/chapter7/7?fw=pt) chapter of the 🤗 Hugging Face Course. | |
| - Check the [Question answering task guide](../tasks/question_answering) on how to use the model. | |
| **Multiple choice** | |
| - [`AlbertForMultipleChoice`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/multiple-choice) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb). | |
| - [`TFAlbertForMultipleChoice`] is supported by this [example script](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/multiple-choice) and [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb). | |
| - Check the [Multiple choice task guide](../tasks/multiple_choice) on how to use the model. | |
| ## AlbertConfig[[albertconfig]] | |
| [[autodoc]] AlbertConfig | |
| ## AlbertTokenizer[[alberttokenizer]] | |
| [[autodoc]] AlbertTokenizer | |
| - get_special_tokens_mask | |
| - save_vocabulary | |
| ## AlbertTokenizerFast[[alberttokenizerfast]] | |
| [[autodoc]] AlbertTokenizerFast | |
| ## Albert specific outputs[[albert-specific-outputs]] | |
| [[autodoc]] models.albert.modeling_albert.AlbertForPreTrainingOutput | |
| ## AlbertModel[[albertmodel]] | |
| [[autodoc]] AlbertModel | |
| - forward | |
| ## AlbertForPreTraining[[albertforpretraining]] | |
| [[autodoc]] AlbertForPreTraining | |
| - forward | |
| ## AlbertForMaskedLM[[albertformaskedlm]] | |
| [[autodoc]] AlbertForMaskedLM | |
| - forward | |
| ## AlbertForSequenceClassification[[albertforsequenceclassification]] | |
| [[autodoc]] AlbertForSequenceClassification | |
| - forward | |
| ## AlbertForMultipleChoice[[albertformultiplechoice]] | |
| [[autodoc]] AlbertForMultipleChoice | |
| ## AlbertForTokenClassification[[albertfortokenclassification]] | |
| [[autodoc]] AlbertForTokenClassification | |
| - forward | |
| ## AlbertForQuestionAnswering[[albertforquestionanswering]] | |
| [[autodoc]] AlbertForQuestionAnswering | |
| - forward | |