HeyDunaX committed on
Commit 313d8a3 · verified · 1 Parent(s): c3f2db5

add model card

Files changed (1)
  1. README.md +26 -0
README.md ADDED
@@ -0,0 +1,26 @@
+ ---
+ language:
+ - vi
+ - ede
+ tags:
+ - cross-lingual-retrieval
+ - morpheme-tokenizer
+ - vanilla-transformer
+ - EViRAL
+ ---
+
+ # Vanilla Transformer + Morpheme Tokenizer — EViRAL
+
+ Task: Ede query → Vietnamese passage retrieval
+ Config: 6 layers / hidden 512 / 8 heads / FFN 2048
+ Tokenizer: corpus-driven morpheme segmentation + Ede-only synonym buffer
+
+ ## Checkpoints
+ | file | description |
+ |------|-------------|
+ | mlm.pt | MLM pre-trained encoder |
+ | align.pt | cross-lingual aligned encoder |
+ | finetune.pt | contrastive fine-tuned encoder (best val) |
+
+ ## Vocab size
+ `32000`
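
As a rough sanity check on the model size implied by the card's `Config` and `Vocab size` entries, the sketch below counts weights for a standard Transformer encoder. The `EncoderConfig` class and the function names are hypothetical. The layout is also an assumption rather than something this commit states: linear layers with biases, two LayerNorms per layer, one token embedding table, and no positional embeddings or output heads. The actual checkpoints may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    # Values copied from the model card; field names are illustrative only.
    num_layers: int = 6
    hidden: int = 512
    num_heads: int = 8
    ffn: int = 2048
    vocab_size: int = 32000


def layer_params(cfg: EncoderConfig) -> int:
    """Parameters in one encoder layer under the assumed standard layout."""
    # Q, K, V and output projections: four (hidden x hidden) matrices plus biases.
    attention = 4 * (cfg.hidden * cfg.hidden + cfg.hidden)
    # Two-layer feed-forward block: hidden -> ffn -> hidden, with biases.
    feed_forward = (cfg.hidden * cfg.ffn + cfg.ffn) + (cfg.ffn * cfg.hidden + cfg.hidden)
    # Two LayerNorms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * cfg.hidden)
    return attention + feed_forward + layer_norms


def encoder_params(cfg: EncoderConfig) -> int:
    """Token embeddings plus all encoder layers (positional embeddings excluded)."""
    token_embeddings = cfg.vocab_size * cfg.hidden
    return token_embeddings + cfg.num_layers * layer_params(cfg)


cfg = EncoderConfig()
head_dim = cfg.hidden // cfg.num_heads  # per-head dimension

print(f"head dim:         {head_dim}")
print(f"params per layer: {layer_params(cfg):,}")
print(f"encoder total:    {encoder_params(cfg):,}")
```

Under these assumptions the token embedding table (32000 × 512) accounts for a little under half of the estimated total, so the vocabulary size affects checkpoint size about as much as the depth of the encoder does.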