<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
โš ๏ธ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Create a custom architecture[[create-a-custom-architecture]]
An [`AutoClass`](model_doc/auto) automatically infers the model architecture and downloads pretrained configuration and weights. Generally, we recommend using an `AutoClass` to produce checkpoint-agnostic code. But users who want more control over specific model parameters can create a custom 🤗 Transformers model from just a few base classes. This could be particularly useful for anyone interested in studying, training or experimenting with a 🤗 Transformers model. In this guide, dive deeper into creating a custom model without an `AutoClass`. Learn how to:

- Load and customize a model configuration.
- Create a model architecture.
- Create a slow and fast tokenizer for text.
- Create an image processor for vision tasks.
- Create a feature extractor for audio tasks.
- Create a processor for multimodal tasks.
## Configuration[[configuration]]
A [configuration](main_classes/configuration) refers to a model's specific attributes. Each model configuration has different attributes; for instance, all NLP models have the `hidden_size`, `num_attention_heads`, `num_hidden_layers` and `vocab_size` attributes in common. These attributes specify the number of attention heads or hidden layers to construct a model with.

Get a closer look at [DistilBERT](model_doc/distilbert) by accessing [`DistilBertConfig`] to inspect its attributes:
```py
>>> from transformers import DistilBertConfig
>>> config = DistilBertConfig()
>>> print(config)
DistilBertConfig {
"activation": "gelu",
"attention_dropout": 0.1,
"dim": 768,
"dropout": 0.1,
"hidden_dim": 3072,
"initializer_range": 0.02,
"max_position_embeddings": 512,
"model_type": "distilbert",
"n_heads": 12,
"n_layers": 6,
"pad_token_id": 0,
"qa_dropout": 0.1,
"seq_classif_dropout": 0.2,
"sinusoidal_pos_embds": false,
"transformers_version": "4.16.2",
"vocab_size": 30522
}
```
[`DistilBertConfig`] displays all the default attributes used to build a base [`DistilBertModel`]. All attributes are customizable, creating space for experimentation. For example, you can customize a default model to:

- Try a different activation function with the `activation` parameter.
- Use a higher dropout ratio for the attention probabilities with the `attention_dropout` parameter.
```py
>>> my_config = DistilBertConfig(activation="relu", attention_dropout=0.4)
>>> print(my_config)
DistilBertConfig {
"activation": "relu",
"attention_dropout": 0.4,
"dim": 768,
"dropout": 0.1,
"hidden_dim": 3072,
"initializer_range": 0.02,
"max_position_embeddings": 512,
"model_type": "distilbert",
"n_heads": 12,
"n_layers": 6,
"pad_token_id": 0,
"qa_dropout": 0.1,
"seq_classif_dropout": 0.2,
"sinusoidal_pos_embds": false,
"transformers_version": "4.16.2",
"vocab_size": 30522
}
```
์‚ฌ์ „ ํ•™์Šต๋œ ๋ชจ๋ธ ์†์„ฑ์€ [`~PretrainedConfig.from_pretrained`] ํ•จ์ˆ˜์—์„œ ์ˆ˜์ •ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:
```py
>>> my_config = DistilBertConfig.from_pretrained("distilbert-base-uncased", activation="relu", attention_dropout=0.4)
```
๋ชจ๋ธ ๊ตฌ์„ฑ์ด ๋งŒ์กฑ์Šค๋Ÿฌ์šฐ๋ฉด [`~PretrainedConfig.save_pretrained`]๋กœ ์ €์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์„ค์ • ํŒŒ์ผ์€ ์ง€์ •๋œ ์ž‘์—… ๊ฒฝ๋กœ์— JSON ํŒŒ์ผ๋กœ ์ €์žฅ๋ฉ๋‹ˆ๋‹ค:
```py
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
```
configuration ํŒŒ์ผ์„ ์žฌ์‚ฌ์šฉํ•˜๋ ค๋ฉด [`~PretrainedConfig.from_pretrained`]๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ€์ ธ์˜ค์„ธ์š”:
```py
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
```
<Tip>
configuration ํŒŒ์ผ์„ ๋”•์…”๋„ˆ๋ฆฌ๋กœ ์ €์žฅํ•˜๊ฑฐ๋‚˜ ์‚ฌ์šฉ์ž ์ •์˜ configuration ์†์„ฑ๊ณผ ๊ธฐ๋ณธ configuration ์†์„ฑ์˜ ์ฐจ์ด์ ๋งŒ ์ €์žฅํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค! ์ž์„ธํ•œ ๋‚ด์šฉ์€ [configuration](main_classes/configuration) ๋ฌธ์„œ๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.
</Tip>
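The "save only the difference" idea from the tip above can be illustrated without 🤗 Transformers at all. The sketch below uses plain Python dictionaries as stand-ins for configuration attributes (the values are made up, not the real `DistilBertConfig` defaults) and keeps only the keys whose values were actually changed:

```python
# Stand-in "default" and "customized" configurations (illustrative only).
default_config = {"activation": "gelu", "attention_dropout": 0.1, "dim": 768}
my_config = {"activation": "relu", "attention_dropout": 0.4, "dim": 768}

# Keep only the attributes that differ from the defaults.
config_diff = {k: v for k, v in my_config.items() if default_config.get(k) != v}
print(config_diff)  # {'activation': 'relu', 'attention_dropout': 0.4}
```

Storing only this diff keeps the saved JSON small and makes it obvious at a glance what was customized.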
## ๋ชจ๋ธ[[model]]
๋‹ค์Œ ๋‹จ๊ณ„๋Š” [๋ชจ๋ธ(model)](main_classes/models)์„ ๋งŒ๋“œ๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋А์Šจํ•˜๊ฒŒ ์•„ํ‚คํ…์ฒ˜๋ผ๊ณ ๋„ ๋ถˆ๋ฆฌ๋Š” ๋ชจ๋ธ์€ ๊ฐ ๊ณ„์ธต์ด ์ˆ˜ํ–‰ํ•˜๋Š” ๋™์ž‘๊ณผ ๋ฐœ์ƒํ•˜๋Š” ์ž‘์—…์„ ์ •์˜ํ•ฉ๋‹ˆ๋‹ค. configuration์˜ `num_hidden_layers`์™€ ๊ฐ™์€ ์†์„ฑ์€ ์•„ํ‚คํ…์ฒ˜๋ฅผ ์ •์˜ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. ๋ชจ๋“  ๋ชจ๋ธ์€ ๊ธฐ๋ณธ ํด๋ž˜์Šค [`PreTrainedModel`]๊ณผ ์ž…๋ ฅ ์ž„๋ฒ ๋”ฉ ํฌ๊ธฐ ์กฐ์ • ๋ฐ ์…€ํ”„ ์–ดํ…์…˜ ํ—ค๋“œ ๊ฐ€์ง€ ์น˜๊ธฐ์™€ ๊ฐ™์€ ๋ช‡ ๊ฐ€์ง€ ์ผ๋ฐ˜์ ์ธ ๋ฉ”์†Œ๋“œ๋ฅผ ๊ณต์œ ํ•ฉ๋‹ˆ๋‹ค. ๋˜ํ•œ ๋ชจ๋“  ๋ชจ๋ธ์€ [`torch.nn.Module`](https://pytorch.org/docs/stable/generated/torch.nn.Module.html), [`tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model) ๋˜๋Š” [`flax.linen.Module`](https://flax.readthedocs.io/en/latest/flax.linen.html#module)์˜ ์„œ๋ธŒํด๋ž˜์Šค์ด๊ธฐ๋„ ํ•ฉ๋‹ˆ๋‹ค. ์ฆ‰, ๋ชจ๋ธ์€ ๊ฐ ํ”„๋ ˆ์ž„์›Œํฌ์˜ ์‚ฌ์šฉ๋ฒ•๊ณผ ํ˜ธํ™˜๋ฉ๋‹ˆ๋‹ค.
<frameworkcontent>
<pt>
์‚ฌ์šฉ์ž ์ง€์ • configuration ์†์„ฑ์„ ๋ชจ๋ธ์— ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค:
```py
>>> from transformers import DistilBertModel
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
>>> model = DistilBertModel(my_config)
```
This creates a model with random values instead of pretrained weights. You won't be able to use this model for anything useful until you train it. Training is a costly and time-consuming process. It is generally better to use a pretrained model to obtain better results faster, while using only a fraction of the resources required for training.

Create a pretrained model with [`~PreTrainedModel.from_pretrained`]:
```py
>>> model = DistilBertModel.from_pretrained("distilbert-base-uncased")
```
๐Ÿค— Transformers์—์„œ ์ œ๊ณตํ•œ ๋ชจ๋ธ์˜ ์‚ฌ์ „ ํ•™์Šต๋œ ๊ฐ€์ค‘์น˜๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ ๊ธฐ๋ณธ ๋ชจ๋ธ configuration์„ ์ž๋™์œผ๋กœ ๋ถˆ๋Ÿฌ์˜ต๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์›ํ•˜๋Š” ๊ฒฝ์šฐ ๊ธฐ๋ณธ ๋ชจ๋ธ configuration ์†์„ฑ์˜ ์ผ๋ถ€ ๋˜๋Š” ์ „๋ถ€๋ฅผ ์‚ฌ์šฉ์ž ์ง€์ •์œผ๋กœ ๋ฐ”๊ฟ€ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:
```py
>>> model = DistilBertModel.from_pretrained("distilbert-base-uncased", config=my_config)
```
</pt>
<tf>
์‚ฌ์šฉ์ž ์ง€์ • configuration ์†์„ฑ์„ ๋ชจ๋ธ์— ๋ถˆ๋Ÿฌ์˜ต๋‹ˆ๋‹ค:
```py
>>> from transformers import TFDistilBertModel
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
>>> tf_model = TFDistilBertModel(my_config)
```
This creates a model with random values instead of pretrained weights. You won't be able to use this model for anything useful until you train it. Training is a costly and time-consuming process. It is generally better to use a pretrained model to obtain better results faster, while using only a fraction of the resources required for training.

Create a pretrained model with [`~TFPreTrainedModel.from_pretrained`]:
```py
>>> tf_model = TFDistilBertModel.from_pretrained("distilbert-base-uncased")
```
๐Ÿค— Transformers์—์„œ ์ œ๊ณตํ•œ ๋ชจ๋ธ์˜ ์‚ฌ์ „ ํ•™์Šต๋œ ๊ฐ€์ค‘์น˜๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ ๊ธฐ๋ณธ ๋ชจ๋ธ configuration์„ ์ž๋™์œผ๋กœ ๋ถˆ๋Ÿฌ์˜ต๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์›ํ•˜๋Š” ๊ฒฝ์šฐ ๊ธฐ๋ณธ ๋ชจ๋ธ configuration ์†์„ฑ์˜ ์ผ๋ถ€ ๋˜๋Š” ์ „๋ถ€๋ฅผ ์‚ฌ์šฉ์ž ์ง€์ •์œผ๋กœ ๋ฐ”๊ฟ€ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:
```py
>>> tf_model = TFDistilBertModel.from_pretrained("distilbert-base-uncased", config=my_config)
```
</tf>
</frameworkcontent>
### ๋ชจ๋ธ ํ—ค๋“œ[[model-heads]]
์ด ์‹œ์ ์—์„œ *์€๋‹‰ ์ƒํƒœ(hidden state)*๋ฅผ ์ถœ๋ ฅํ•˜๋Š” ๊ธฐ๋ณธ DistilBERT ๋ชจ๋ธ์„ ๊ฐ–๊ฒŒ ๋ฉ๋‹ˆ๋‹ค. ์€๋‹‰ ์ƒํƒœ๋Š” ์ตœ์ข… ์ถœ๋ ฅ์„ ์ƒ์„ฑํ•˜๊ธฐ ์œ„ํ•ด ๋ชจ๋ธ ํ—ค๋“œ์— ์ž…๋ ฅ์œผ๋กœ ์ „๋‹ฌ๋ฉ๋‹ˆ๋‹ค. ๐Ÿค— Transformers๋Š” ๋ชจ๋ธ์ด ํ•ด๋‹น ์ž‘์—…์„ ์ง€์›ํ•˜๋Š” ํ•œ ๊ฐ ์ž‘์—…๋งˆ๋‹ค ๋‹ค๋ฅธ ๋ชจ๋ธ ํ—ค๋“œ๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค(์ฆ‰, ๋ฒˆ์—ญ๊ณผ ๊ฐ™์€ ์‹œํ€€์Šค ๊ฐ„ ์ž‘์—…์—๋Š” DistilBERT๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์—†์Œ).
<frameworkcontent>
<pt>
For example, [`DistilBertForSequenceClassification`] is a base DistilBERT model with a sequence classification head. The sequence classification head is a linear layer on top of the pooled outputs.
```py
>>> from transformers import DistilBertForSequenceClassification
>>> model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
```
๋‹ค๋ฅธ ๋ชจ๋ธ ํ—ค๋“œ๋กœ ์ „ํ™˜ํ•˜์—ฌ ์ด ์ฒดํฌํฌ์ธํŠธ๋ฅผ ๋‹ค๋ฅธ ์ž‘์—…์— ์‰ฝ๊ฒŒ ์žฌ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์งˆ์˜์‘๋‹ต ์ž‘์—…์˜ ๊ฒฝ์šฐ, [`DistilBertForQuestionAnswering`] ๋ชจ๋ธ ํ—ค๋“œ๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์งˆ์˜์‘๋‹ต ํ—ค๋“œ๋Š” ์ˆจ๊ฒจ์ง„ ์ƒํƒœ ์ถœ๋ ฅ ์œ„์— ์„ ํ˜• ๋ ˆ์ด์–ด๊ฐ€ ์žˆ๋‹ค๋Š” ์ ์„ ์ œ์™ธํ•˜๋ฉด ์‹œํ€€์Šค ๋ถ„๋ฅ˜ ํ—ค๋“œ์™€ ์œ ์‚ฌํ•ฉ๋‹ˆ๋‹ค.
```py
>>> from transformers import DistilBertForQuestionAnswering
>>> model = DistilBertForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```
</pt>
<tf>
For example, [`TFDistilBertForSequenceClassification`] is a base DistilBERT model with a sequence classification head. The sequence classification head is a linear layer on top of the pooled outputs.
```py
>>> from transformers import TFDistilBertForSequenceClassification
>>> tf_model = TFDistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
```
๋‹ค๋ฅธ ๋ชจ๋ธ ํ—ค๋“œ๋กœ ์ „ํ™˜ํ•˜์—ฌ ์ด ์ฒดํฌํฌ์ธํŠธ๋ฅผ ๋‹ค๋ฅธ ์ž‘์—…์— ์‰ฝ๊ฒŒ ์žฌ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์งˆ์˜์‘๋‹ต ์ž‘์—…์˜ ๊ฒฝ์šฐ, [`TFDistilBertForQuestionAnswering`] ๋ชจ๋ธ ํ—ค๋“œ๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์งˆ์˜์‘๋‹ต ํ—ค๋“œ๋Š” ์ˆจ๊ฒจ์ง„ ์ƒํƒœ ์ถœ๋ ฅ ์œ„์— ์„ ํ˜• ๋ ˆ์ด์–ด๊ฐ€ ์žˆ๋‹ค๋Š” ์ ์„ ์ œ์™ธํ•˜๋ฉด ์‹œํ€€์Šค ๋ถ„๋ฅ˜ ํ—ค๋“œ์™€ ์œ ์‚ฌํ•ฉ๋‹ˆ๋‹ค.
```py
>>> from transformers import TFDistilBertForQuestionAnswering
>>> tf_model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```
</tf>
</frameworkcontent>
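To make the relationship between hidden states and a head concrete, here is a minimal pure-Python sketch. It is not the real 🤗 Transformers implementation; the shapes, weights, and pooling scheme are invented for illustration. The "pooled output" is taken as the first token's hidden state, and the classification head is just a linear layer mapping it to one logit per label:

```python
# Toy sketch of a sequence classification head: a linear layer on top of
# a pooled output (all numbers below are made up for illustration).
hidden_size, num_labels = 4, 2

# Hidden states for a 3-token sequence (seq_len x hidden_size).
hidden_states = [
    [0.1, 0.2, 0.3, 0.4],  # first token's state is used as the pooled output
    [0.5, 0.1, 0.0, 0.2],
    [0.3, 0.3, 0.1, 0.0],
]
pooled_output = hidden_states[0]

# Linear layer: logits = W @ pooled_output + b (num_labels x hidden_size weights).
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]
b = [0.0, 0.1]
logits = [sum(w * x for w, x in zip(row, pooled_output)) + bias
          for row, bias in zip(W, b)]
print(logits)  # one score per label
```

Swapping the head (e.g., to question answering) changes only this final layer; the base model producing `hidden_states` stays the same, which is why one checkpoint can be reused across tasks.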
## ํ† ํฌ๋‚˜์ด์ €[[tokenizer]]
ํ…์ŠคํŠธ ๋ฐ์ดํ„ฐ์— ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋งˆ์ง€๋ง‰์œผ๋กœ ํ•„์š”ํ•œ ๊ธฐ๋ณธ ํด๋ž˜์Šค๋Š” ์›์‹œ ํ…์ŠคํŠธ๋ฅผ ํ…์„œ๋กœ ๋ณ€ํ™˜ํ•˜๋Š” [ํ† ํฌ๋‚˜์ด์ €](main_classes/tokenizer)์ž…๋‹ˆ๋‹ค. ๐Ÿค— Transformers์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ํ† ํฌ๋‚˜์ด์ €๋Š” ๋‘ ๊ฐ€์ง€ ์œ ํ˜•์ด ์žˆ์Šต๋‹ˆ๋‹ค:
- [`PreTrainedTokenizer`]: ํŒŒ์ด์ฌ์œผ๋กœ ๊ตฌํ˜„๋œ ํ† ํฌ๋‚˜์ด์ €์ž…๋‹ˆ๋‹ค.
- [`PreTrainedTokenizerFast`]: Rust ๊ธฐ๋ฐ˜ [๐Ÿค— Tokenizer](https://huggingface.co/docs/tokenizers/python/latest/) ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋กœ ๋งŒ๋“ค์–ด์ง„ ํ† ํฌ๋‚˜์ด์ €์ž…๋‹ˆ๋‹ค. ์ด ํ† ํฌ๋‚˜์ด์ €๋Š” Rust๋กœ ๊ตฌํ˜„๋˜์–ด ๋ฐฐ์น˜ ํ† ํฐํ™”์—์„œ ํŠนํžˆ ๋น ๋ฆ…๋‹ˆ๋‹ค. ๋น ๋ฅธ ํ† ํฌ๋‚˜์ด์ €๋Š” ํ† ํฐ์„ ์›๋ž˜ ๋‹จ์–ด๋‚˜ ๋ฌธ์ž์— ๋งคํ•‘ํ•˜๋Š” *์˜คํ”„์…‹ ๋งคํ•‘*๊ณผ ๊ฐ™์€ ์ถ”๊ฐ€ ๋ฉ”์†Œ๋“œ๋„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.
๋‘ ํ† ํฌ๋‚˜์ด์ € ๋ชจ๋‘ ์ธ์ฝ”๋”ฉ ๋ฐ ๋””์ฝ”๋”ฉ, ์ƒˆ ํ† ํฐ ์ถ”๊ฐ€, ํŠน์ˆ˜ ํ† ํฐ ๊ด€๋ฆฌ์™€ ๊ฐ™์€ ์ผ๋ฐ˜์ ์ธ ๋ฐฉ๋ฒ•์„ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค.
<Tip warning={true}>
๋ชจ๋“  ๋ชจ๋ธ์ด ๋น ๋ฅธ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์ง€์›ํ•˜๋Š” ๊ฒƒ์€ ์•„๋‹™๋‹ˆ๋‹ค. ์ด [ํ‘œ](index#supported-frameworks)์—์„œ ๋ชจ๋ธ์˜ ๋น ๋ฅธ ํ† ํฌ๋‚˜์ด์ € ์ง€์› ์—ฌ๋ถ€๋ฅผ ํ™•์ธํ•˜์„ธ์š”.
</Tip>
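The shared methods mentioned above (encoding, decoding, adding new tokens) can be sketched with a toy whitespace tokenizer in plain Python. The vocabulary, ids, and `[UNK]` handling below are invented for illustration and have nothing to do with the real DistilBERT vocabulary:

```python
# Toy whitespace tokenizer illustrating encode/decode and adding new tokens.
class ToyTokenizer:
    def __init__(self, vocab):
        self.vocab = dict(vocab)                          # token -> id
        self.ids = {i: t for t, i in self.vocab.items()}  # id -> token
        self.unk_id = self.vocab["[UNK]"]

    def add_tokens(self, tokens):
        # Assign the next free id to each genuinely new token.
        for token in tokens:
            if token not in self.vocab:
                new_id = len(self.vocab)
                self.vocab[token] = new_id
                self.ids[new_id] = token

    def encode(self, text):
        # Unknown words fall back to the [UNK] id.
        return [self.vocab.get(t, self.unk_id) for t in text.split()]

    def decode(self, ids):
        return " ".join(self.ids[i] for i in ids)

tok = ToyTokenizer({"[UNK]": 0, "hello": 1, "world": 2})
print(tok.encode("hello new world"))  # "new" is unknown -> [1, 0, 2]
tok.add_tokens(["new"])
print(tok.encode("hello new world"))  # now [1, 3, 2]
print(tok.decode([1, 3, 2]))          # "hello new world"
```

Real tokenizers add subword splitting, special tokens, padding, and truncation on top of this round-trip, but the encode/decode contract is the same.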
ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์ง์ ‘ ํ•™์Šตํ•œ ๊ฒฝ์šฐ, *์–ดํœ˜(vocabulary)* ํŒŒ์ผ์—์„œ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ๋งŒ๋“ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:
```py
>>> from transformers import DistilBertTokenizer
>>> my_tokenizer = DistilBertTokenizer(vocab_file="my_vocab_file.txt", do_lower_case=False, padding_side="left")
```
์‚ฌ์šฉ์ž ์ง€์ • ํ† ํฌ๋‚˜์ด์ €์˜ ์–ดํœ˜๋Š” ์‚ฌ์ „ ํ•™์Šต๋œ ๋ชจ๋ธ์˜ ํ† ํฌ๋‚˜์ด์ €์—์„œ ์ƒ์„ฑ๋œ ์–ดํœ˜์™€ ๋‹ค๋ฅผ ์ˆ˜ ์žˆ๋‹ค๋Š” ์ ์„ ๊ธฐ์–ตํ•˜๋Š” ๊ฒƒ์ด ์ค‘์š”ํ•ฉ๋‹ˆ๋‹ค. ์‚ฌ์ „ ํ•™์Šต๋œ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ ์‚ฌ์ „ ํ•™์Šต๋œ ๋ชจ๋ธ์˜ ์–ดํœ˜๋ฅผ ์‚ฌ์šฉํ•ด์•ผ ํ•˜๋ฉฐ, ๊ทธ๋ ‡์ง€ ์•Š์œผ๋ฉด ์ž…๋ ฅ์ด ์˜๋ฏธ๋ฅผ ๊ฐ–์ง€ ๋ชปํ•ฉ๋‹ˆ๋‹ค. [`DistilBertTokenizer`] ํด๋ž˜์Šค๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์‚ฌ์ „ ํ•™์Šต๋œ ๋ชจ๋ธ์˜ ์–ดํœ˜๋กœ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค:
```py
>>> from transformers import DistilBertTokenizer
>>> slow_tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
```
Create a fast tokenizer with the [`DistilBertTokenizerFast`] class:
```py
>>> from transformers import DistilBertTokenizerFast
>>> fast_tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
```
<Tip>
By default, [`AutoTokenizer`] will try to load a fast tokenizer. You can disable this behavior by setting `use_fast=False` in `from_pretrained`.
</Tip>
## Image processor[[image-processor]]
An image processor processes vision inputs. It inherits from the base [`~image_processing_utils.ImageProcessingMixin`] class.

To use, create an image processor associated with the model you're using. For example, create a default [`ViTImageProcessor`] if you are using [ViT](model_doc/vit) for image classification:
```py
>>> from transformers import ViTImageProcessor
>>> vit_extractor = ViTImageProcessor()
>>> print(vit_extractor)
ViTImageProcessor {
"do_normalize": true,
"do_resize": true,
"feature_extractor_type": "ViTImageProcessor",
"image_mean": [
0.5,
0.5,
0.5
],
"image_std": [
0.5,
0.5,
0.5
],
"resample": 2,
"size": 224
}
```
<Tip>
์‚ฌ์šฉ์ž ์ง€์ •์„ ์›ํ•˜์ง€ ์•Š๋Š” ๊ฒฝ์šฐ `from_pretrained` ๋ฉ”์†Œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์˜ ๊ธฐ๋ณธ ์ด๋ฏธ์ง€ ํ”„๋กœ์„ธ์„œ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ๋ถˆ๋Ÿฌ์˜ค๋ฉด ๋ฉ๋‹ˆ๋‹ค.
</Tip>
์‚ฌ์šฉ์ž ์ง€์ • ์ด๋ฏธ์ง€ ํ”„๋กœ์„ธ์„œ๋ฅผ ์ƒ์„ฑํ•˜๋ ค๋ฉด [`ViTImageProcessor`] ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ˆ˜์ •ํ•ฉ๋‹ˆ๋‹ค:
```py
>>> from transformers import ViTImageProcessor
>>> my_vit_extractor = ViTImageProcessor(resample="PIL.Image.BOX", do_normalize=False, image_mean=[0.3, 0.3, 0.3])
>>> print(my_vit_extractor)
ViTImageProcessor {
"do_normalize": false,
"do_resize": true,
"feature_extractor_type": "ViTImageProcessor",
"image_mean": [
0.3,
0.3,
0.3
],
"image_std": [
0.5,
0.5,
0.5
],
"resample": "PIL.Image.BOX",
"size": 224
}
```
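The `do_normalize`, `image_mean`, and `image_std` parameters shown above boil down to a simple per-channel operation: `(pixel - mean) / std`. A minimal stdlib-only sketch of that step, with a made-up pixel and the default mean/std of 0.5:

```python
# Per-channel normalization as enabled by do_normalize=True:
# normalized = (pixel - image_mean[c]) / image_std[c]
image_mean = [0.5, 0.5, 0.5]
image_std = [0.5, 0.5, 0.5]

# One RGB pixel with channel values in [0, 1] (made up for illustration).
pixel = [0.75, 0.5, 0.25]
normalized = [(p - m) / s for p, m, s in zip(pixel, image_mean, image_std)]
print(normalized)  # [0.5, 0.0, -0.5]
```

With a mean and std of 0.5, inputs in `[0, 1]` are rescaled to `[-1, 1]`, which is why the customized example above only changes `image_mean` when a different centering is wanted.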
## Feature extractor[[feature-extractor]]
A feature extractor processes audio inputs. It inherits from the base [`~feature_extraction_utils.FeatureExtractionMixin`] class, and may also inherit from the [`SequenceFeatureExtractor`] class for processing audio inputs.

To use, create a feature extractor associated with the model you're using. For example, create a default [`Wav2Vec2FeatureExtractor`] if you are using [Wav2Vec2](model_doc/wav2vec2) for audio classification:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> w2v2_extractor = Wav2Vec2FeatureExtractor()
>>> print(w2v2_extractor)
Wav2Vec2FeatureExtractor {
"do_normalize": true,
"feature_extractor_type": "Wav2Vec2FeatureExtractor",
"feature_size": 1,
"padding_side": "right",
"padding_value": 0.0,
"return_attention_mask": false,
"sampling_rate": 16000
}
```
<Tip>
์‚ฌ์šฉ์ž ์ง€์ •์ด ํ•„์š”ํ•˜์ง€ ์•Š์€ ๊ฒฝ์šฐ `from_pretrained` ๋ฉ”์†Œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์˜ ๊ธฐ๋ณธ ํŠน์„ฑ ์ถ”์ถœ๊ธฐ ใ…๊ฐœ๋ณ€์ˆ˜๋ฅผ ๋ถˆ๋Ÿฌ ์˜ค๋ฉด ๋ฉ๋‹ˆ๋‹ค.
</Tip>
์‚ฌ์šฉ์ž ์ง€์ • ํŠน์„ฑ ์ถ”์ถœ๊ธฐ๋ฅผ ๋งŒ๋“ค๋ ค๋ฉด [`Wav2Vec2FeatureExtractor`] ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์ˆ˜์ •ํ•ฉ๋‹ˆ๋‹ค:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> w2v2_extractor = Wav2Vec2FeatureExtractor(sampling_rate=8000, do_normalize=False)
>>> print(w2v2_extractor)
Wav2Vec2FeatureExtractor {
"do_normalize": false,
"feature_extractor_type": "Wav2Vec2FeatureExtractor",
"feature_size": 1,
"padding_side": "right",
"padding_value": 0.0,
"return_attention_mask": false,
"sampling_rate": 8000
}
```
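The `do_normalize` and `padding_value` parameters above can likewise be sketched in plain Python: zero-mean/unit-variance normalization of the raw waveform, then padding to a fixed length with `padding_value`. This is a simplified illustration with made-up numbers, not the actual Wav2Vec2 implementation:

```python
import math

# Zero-mean, unit-variance normalization (the kind of normalization
# do_normalize enables), then padding to a target length.
def normalize_and_pad(samples, target_len, padding_value=0.0):
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    std = math.sqrt(variance)  # assumed nonzero for this sketch
    normalized = [(s - mean) / std for s in samples]
    return normalized + [padding_value] * (target_len - len(normalized))

raw = [0.0, 2.0, 4.0]  # a made-up 3-sample waveform
padded = normalize_and_pad(raw, target_len=5)
print(padded)  # three normalized samples followed by two padding values
```

Batching is the reason padding exists at all: waveforms of different lengths are normalized individually, then padded to a common length so they can be stacked into one tensor.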
## Processor[[processor]]
For models that support multimodal tasks, 🤗 Transformers offers a processor class that conveniently wraps processing classes such as a feature extractor and a tokenizer into a single object. For example, let's use the [`Wav2Vec2Processor`] for an automatic speech recognition (ASR) task. ASR transcribes audio to text, so you will need a feature extractor and a tokenizer.

Create a feature extractor to handle the audio inputs:
```py
>>> from transformers import Wav2Vec2FeatureExtractor
>>> feature_extractor = Wav2Vec2FeatureExtractor(padding_value=1.0, do_normalize=True)
```
ํ…์ŠคํŠธ ์ž…๋ ฅ์„ ์ฒ˜๋ฆฌํ•  ํ† ํฌ๋‚˜์ด์ €๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค:
```py
>>> from transformers import Wav2Vec2CTCTokenizer
>>> tokenizer = Wav2Vec2CTCTokenizer(vocab_file="my_vocab_file.txt")
```
[`Wav2Vec2Processor`]์—์„œ ํŠน์„ฑ ์ถ”์ถœ๊ธฐ์™€ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ๊ฒฐํ•ฉํ•ฉ๋‹ˆ๋‹ค:
```py
>>> from transformers import Wav2Vec2Processor
>>> processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)
```
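Conceptually, a processor just dispatches each input type to the right wrapped component. Here is a toy stdlib-only sketch of that wrapping; the stand-in components and their outputs are invented for illustration, and the real [`Wav2Vec2Processor`] API is considerably richer than this:

```python
# Toy sketch of a processor wrapping a feature extractor (for audio)
# and a tokenizer (for text) in a single object.
class ToyProcessor:
    def __init__(self, feature_extractor, tokenizer):
        self.feature_extractor = feature_extractor
        self.tokenizer = tokenizer

    def __call__(self, audio=None, text=None):
        # Route audio to the feature extractor, text to the tokenizer.
        if audio is not None:
            return self.feature_extractor(audio)
        return self.tokenizer(text)

def feature_extractor(audio):
    # Stand-in: scale 16-bit samples into [-1, 1] (made up for illustration).
    return {"input_values": [s / 32768 for s in audio]}

def tokenizer(text):
    # Stand-in: one "id" per character (made up for illustration).
    return {"input_ids": [ord(c) for c in text]}

processor = ToyProcessor(feature_extractor, tokenizer)
print(processor(audio=[16384, -16384]))  # routed to the feature extractor
print(processor(text="hi"))              # routed to the tokenizer
```

The benefit is ergonomic: downstream code holds one object and one call signature, regardless of which modality is being preprocessed.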
With two basic classes - configuration and model - and an additional preprocessing class (tokenizer, image processor, feature extractor, or processor), you can create any of the models supported by 🤗 Transformers. Each of these base classes is configurable, allowing you to use the specific attributes you want. You can easily set up a model for training or modify an existing pretrained model to fine-tune.