---
language:
- ko
tags:
- roberta
- tokenizer only
license:
- mit
---
## Library versions

- transformers: 4.21.2
- datasets: 2.4.0
- tokenizers: 0.12.1

A tokenizer trained in the same way as [Bingsu/ko_BBPE_tokenizer_roberta](https://huggingface.co/Bingsu/ko_BBPE_tokenizer_roberta).
However, `unicode_normalizer="nfkc"` was left out.

```python
from tokenizers import ByteLevelBPETokenizer
tokenizer = ByteLevelBPETokenizer(trim_offsets=True)  # unicode_normalizer is not set
```
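
To give a rough idea of what leaving out NFKC normalization means for Korean text, the sketch below uses Python's standard `unicodedata` module rather than the tokenizer itself. It assumes that `unicode_normalizer="nfkc"` corresponds to standard Unicode NFKC normalization applied before tokenization. The helper name `nfkc` and the sample strings are illustrative only and are not taken from the original training setup.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Return the NFKC-normalized form of text (hypothetical helper)."""
    return unicodedata.normalize("NFKC", text)


# Illustrative samples, not taken from the training corpus.
samples = [
    "ＡＢＣ",  # full-width Latin letters
    "ㅋㅋ",    # Hangul compatibility jamo
    "한글",    # precomposed Hangul syllables
]

for text in samples:
    normalized = nfkc(text)
    status = "changed" if normalized != text else "unchanged"
    # Print code points so that visually similar characters can be told apart.
    before = " ".join(f"U+{ord(ch):04X}" for ch in text)
    after = " ".join(f"U+{ord(ch):04X}" for ch in normalized)
    print(f"{before} -> {after} ({status})")
```

Under NFKC, full-width letters and compatibility jamo are mapped to different code points, while precomposed syllables are left alone. A tokenizer built without the normalizer, as in this repository, receives such characters as they appear in the input.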