Commit a3bb12a by Doohae · 1 Parent(s): 08c6d28

Create README.md

  1. README.md +18 -0
README.md ADDED
@@ -0,0 +1,18 @@
+ # ELECTRA discriminator base
+ - Pretrained on large Korean corpus datasets (30 GB)
+ - 113M model parameters (follows the google/electra-base-discriminator config)
+ - Vocabulary size: 35,000
+ - Trained for 1,000,000 steps
+ - Built on the [lassl](https://github.com/lassl/lassl) framework
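
The parameter figure above can be sanity-checked with a little arithmetic. The sketch below is not taken from the model's code: it assumes the usual ELECTRA base-size dimensions (hidden size 768, embedding size 768, 12 layers, feed-forward size 3072, 512 positions, 2 token types) and a standard discriminator head, and only the 35,000 vocabulary size comes from the README. The function name is hypothetical.

```python
def electra_discriminator_params(
    vocab_size,
    hidden=768,          # assumed base-size hidden dimension
    embedding=768,       # assumed embedding dimension
    layers=12,           # assumed number of encoder layers
    intermediate=3072,   # assumed feed-forward dimension
    max_positions=512,   # assumed maximum sequence length
    type_vocab=2,        # assumed number of token types
):
    """Estimate the trainable parameter count of an ELECTRA-style discriminator."""
    # Word, position and token-type embeddings plus one LayerNorm (weight + bias).
    emb = (vocab_size + max_positions + type_vocab) * embedding + 2 * embedding
    # A linear projection is only needed when embedding and hidden sizes differ.
    proj = 0 if embedding == hidden else embedding * hidden + hidden
    # Self-attention: Q, K, V and output projections (with biases) plus LayerNorm.
    attn = 4 * (hidden * hidden + hidden) + 2 * hidden
    # Feed-forward: two linear layers (with biases) plus LayerNorm.
    ffn = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + 2 * hidden
    )
    encoder = layers * (attn + ffn)
    # Discriminator head: hidden -> hidden dense layer, then hidden -> 1 prediction.
    head = (hidden * hidden + hidden) + (hidden + 1)
    return emb + proj + encoder + head


if __name__ == "__main__":
    total = electra_discriminator_params(vocab_size=35_000)
    print(f"{total:,} parameters")  # 112,922,113, i.e. roughly 113M
```

Under these assumptions the estimate rounds to the 113M stated above. Passing small-size dimensions instead (for example `hidden=256, embedding=128, intermediate=1024`) gives a far lower count, so the figure appears to correspond to a base-size model.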
+
+
+ pretrain-data
+ ┣ korean_corpus.txt
+ ┣ kowiki_latest.txt
+ ┣ modu_dialogue_v1.2.txt
+ ┣ modu_news_v1.1.txt
+ ┣ modu_news_v2.0.txt
+ ┣ modu_np_2021_v1.0.txt
+ ┣ modu_np_v1.1.txt
+ ┣ modu_spoken_v1.2.txt
+ ┗ modu_written_v1.0.txt
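
For anyone rebuilding a similar corpus directory, a small script can tally the size of each text file and compare the total against the roughly 30 GB mentioned above. This is only an illustrative sketch: the helper names are hypothetical, the directory is assumed to be a local folder named `pretrain-data` holding the `.txt` files listed, and "GB" is assumed to mean 10^9 bytes.

```python
from pathlib import Path


def corpus_sizes(root):
    """Return {file name: size in bytes} for every .txt file directly under root."""
    return {p.name: p.stat().st_size for p in sorted(Path(root).glob("*.txt"))}


def total_gb(sizes):
    """Sum the sizes and convert to decimal gigabytes (assumption: 1 GB = 10**9 bytes)."""
    return sum(sizes.values()) / 10**9


if __name__ == "__main__":
    # "pretrain-data" is the directory name from the listing above; adjust as needed.
    sizes = corpus_sizes("pretrain-data")
    for name, size in sizes.items():
        print(f"{name}: {size / 10**9:.2f} GB")
    print(f"total: {total_gb(sizes):.2f} GB")
```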