pkedzia committed on
Commit dbbeae1 · 1 Parent(s): fa45eb4

Create README.md

Files changed (1)
  1. README.md +23 -0
README.md ADDED
@@ -0,0 +1,23 @@
+ ---
+ license: cc-by-4.0
+ language:
+ - pl
+ library_name: transformers
+ tags:
+ - tokenizer
+ - fast-tokenizer
+ - polish
+ datasets:
+ - clarin-knext/msmarco-pl
+ - clarin-knext/fiqa-pl
+ - clarin-knext/scifact-pl
+ - clarin-knext/nfcorpus-pl
+ - radlab/legal-mc4-pl
+ - radlab/wikipedia-pl
+ - radlab/kgr10
+ ---
+
+ This is a fast tokenizer for Polish.
+
+ Number of documents used to train the tokenizer:
+ - 25 088 398
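
The README declares `library_name: transformers` and the `fast-tokenizer` tag, so loading would presumably go through `AutoTokenizer`. The following is a minimal sketch under that assumption, not the author's documented usage. The commit does not state the repository ID, so the value in the usage comment is a placeholder, and the helper names `load_tokenizer` and `count_tokens` are hypothetical.

```python
def load_tokenizer(repo_id: str):
    """Load a fast tokenizer from the Hugging Face Hub.

    `repo_id` is an assumption: the commit does not name the repository,
    so the caller has to supply the real "<namespace>/<name>" string.
    Requires the `transformers` package and network access.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=True)
    if not tokenizer.is_fast:
        # The README tags this as a fast tokenizer; fail loudly otherwise.
        raise RuntimeError(f"{repo_id} did not load as a fast tokenizer")
    return tokenizer


def count_tokens(tokenizer, text: str) -> int:
    """Return the number of token IDs the tokenizer produces for `text`."""
    return len(tokenizer(text)["input_ids"])


# Usage (placeholder repository ID, replace with the real one):
#   tok = load_tokenizer("<namespace>/<tokenizer-repo>")
#   print(tok.tokenize("Zażółć gęślą jaźń"))
#   print(count_tokens(tok, "Zażółć gęślą jaźń"))
```

Note that `count_tokens` includes any special tokens the tokenizer adds by default, so the count can be larger than the length of `tok.tokenize(text)`.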