Bingsu committed
Commit efdcfc6 · 1 Parent(s): 6e4350a

Create README.md

Files changed (1)
  1. README.md +23 -0
README.md ADDED
@@ -0,0 +1,23 @@
+ ---
+ language:
+ - ko
+ tags:
+ - roberta
+ - tokenizer only
+ license:
+ - mit
+ ---
+
+ ## Library versions
+
+ - transformers: 4.21.2
+ - datasets: 2.4.0
+ - tokenizers: 0.12.1
+
+ A tokenizer trained with the same method as [Bingsu/ko_BBPE_tokenizer_roberta](https://huggingface.co/Bingsu/ko_BBPE_tokenizer_roberta).
+
+ The one difference is that `unicode_normalizer="nfkc"` was left out.
+
+ ```python
+ tokenizer = ByteLevelBPETokenizer(trim_offsets=True)
+ ```
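
To show what leaving out `unicode_normalizer="nfkc"` may mean in practice, here is a minimal sketch using only the standard-library `unicodedata` module. It applies NFKC normalization, the transformation the omitted option would run before tokenization, to a few sample strings. The helper names `nfkc` and `changed_by_nfkc` and the sample inputs are assumptions chosen for illustration; they do not come from the README or its training data.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Return the NFKC-normalized form of `text`."""
    return unicodedata.normalize("NFKC", text)


def changed_by_nfkc(text: str) -> bool:
    """Report whether NFKC normalization would rewrite `text`."""
    return nfkc(text) != text


if __name__ == "__main__":
    # Hypothetical sample inputs, chosen only for illustration.
    samples = [
        "ＡＢＣ",   # fullwidth Latin letters (U+FF21..U+FF23)
        "ㄱ",       # Hangul compatibility jamo (U+3131)
        "한국어",   # precomposed Hangul syllables
    ]
    for s in samples:
        print(repr(s), "->", repr(nfkc(s)), "| changed:", changed_by_nfkc(s))
```

Without the normalizer, a tokenizer built as in the README sees such characters exactly as they appear in the input. Whether that is preferable depends on the corpus, so treat this as an aid to understanding rather than a recommendation.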