Cong123779 committed on
Commit
9cfc197
·
verified ·
1 Parent(s): f4e986b

Upload data/processed/shared_vocab_info.json

data/processed/shared_vocab_info.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "vocab_size": 32002,
+   "sos_id": 1,
+   "eos_id": 2,
+   "pad_id": 0,
+   "unk_id": 3,
+   "en2vi_id": 32000,
+   "vi2en_id": 32001,
+   "tokenizer_path": "/home/alida/Documents/Cursor/NLP/data/processed/tokenizer_shared.json"
+ }
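
A minimal sketch of how a consumer might sanity-check this file before using it. The JSON is inlined so the snippet runs on its own. The helper name `validate_vocab_info` and the reading of `en2vi_id`/`vi2en_id` as translation-direction tags appended after a 32000-entry base vocabulary are assumptions drawn from the key names and values, not something the commit states.

```python
import json

# Contents of shared_vocab_info.json as added in this commit, inlined so the
# sketch is self-contained. In practice the file would be read from disk.
SHARED_VOCAB_INFO = """
{
  "vocab_size": 32002,
  "sos_id": 1,
  "eos_id": 2,
  "pad_id": 0,
  "unk_id": 3,
  "en2vi_id": 32000,
  "vi2en_id": 32001,
  "tokenizer_path": "/home/alida/Documents/Cursor/NLP/data/processed/tokenizer_shared.json"
}
"""

# Keys in the file that hold token ids.
ID_KEYS = ("pad_id", "sos_id", "eos_id", "unk_id", "en2vi_id", "vi2en_id")


def validate_vocab_info(info):
    """Hypothetical helper: check that every special id is unique and in range."""
    ids = {key: info[key] for key in ID_KEYS}
    for name, token_id in ids.items():
        if not 0 <= token_id < info["vocab_size"]:
            raise ValueError(f"{name}={token_id} is outside the vocabulary")
    if len(set(ids.values())) != len(ids):
        raise ValueError("special token ids must be distinct")
    return ids


info = json.loads(SHARED_VOCAB_INFO)
special_ids = validate_vocab_info(info)

# Assumption: the two direction tags occupy the last two ids, so the
# underlying tokenizer would have vocab_size - 2 entries.
base_vocab_size = info["vocab_size"] - 2
print(special_ids, base_vocab_size)
```

Note that `tokenizer_path` is an absolute path on the uploader's machine, so code loading this file elsewhere would likely need to resolve the tokenizer relative to the repository instead.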