---
language:
- vi
- ede
tags:
- cross-lingual-retrieval
- bpe-tokenizer
- vanilla-transformer
- EViRAL
---

# Vanilla Transformer + BPE — EViRAL

- Task: Ede query → Vietnamese passage retrieval
- Config: 6 layers / hidden size 512 / 8 attention heads / FFN size 2048
- Tokenizer: BPE (vocabulary of 32,000 tokens, trained from scratch on a combined Ede and Vietnamese corpus)
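
For orientation, the configuration above can be sketched with stock PyTorch modules. This is a minimal illustration, not the code used to train these checkpoints: the class name `SketchEncoder`, the learned positional embeddings, `MAX_LEN`, `PAD_ID`, and the mean pooling are all assumptions that the card does not state, so parameter names here will most likely not match the released state dicts.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 32_000  # from the card
HIDDEN = 512         # from the card
NUM_LAYERS = 6       # from the card
NUM_HEADS = 8        # from the card
FFN_DIM = 2048       # from the card
MAX_LEN = 512        # assumption: maximum sequence length is not stated
PAD_ID = 0           # assumption: padding token id is not stated


class SketchEncoder(nn.Module):
    """Hypothetical stand-in for the encoder described in the card."""

    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(VOCAB_SIZE, HIDDEN, padding_idx=PAD_ID)
        self.pos = nn.Embedding(MAX_LEN, HIDDEN)  # learned positions (assumed)
        layer = nn.TransformerEncoderLayer(
            d_model=HIDDEN,
            nhead=NUM_HEADS,
            dim_feedforward=FFN_DIM,
            batch_first=True,
        )
        # Nested tensors are disabled so the output keeps the padded length.
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=NUM_LAYERS, enable_nested_tensor=False
        )

    def forward(self, input_ids):
        pad_mask = input_ids.eq(PAD_ID)  # True at padding positions
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.tok(input_ids) + self.pos(positions)
        hidden = self.encoder(x, src_key_padding_mask=pad_mask)
        # Mean-pool over non-padding tokens: one vector per sequence (assumed pooling).
        keep = (~pad_mask).unsqueeze(-1).to(hidden.dtype)
        return (hidden * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
```

Under these assumptions, an Ede query and a set of Vietnamese passages would each be mapped to a 512-dimensional vector, and passages could then be ranked by cosine similarity to the query vector.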

## Checkpoints

| File | Description |
|---|---|
| `mlm.pt` | MLM pre-trained encoder |
| `align.pt` | Cross-lingually aligned encoder |
| `finetune.pt` | Contrastively fine-tuned encoder (best validation checkpoint) |
| `bpe_tokenizer/tokenizer.json` | BPE tokenizer |
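
The card does not document the on-disk layout of the `.pt` files, so the following is only a hedged sketch of how they might be opened. It assumes each checkpoint is either a bare state dict or a dict wrapping one under a key such as `state_dict` or `model`, and that `tokenizer.json` is in the Hugging Face `tokenizers` JSON format. The helper names `load_state_dict` and `load_tokenizer` are hypothetical.

```python
import torch


def load_state_dict(path):
    """Load a checkpoint onto CPU and return its state dict, if one is found."""
    ckpt = torch.load(path, map_location="cpu")
    # Assumption: the file holds either a bare state dict or a dict wrapping
    # one under "state_dict" or "model". Print the keys to confirm the layout.
    if isinstance(ckpt, dict):
        for key in ("state_dict", "model"):
            if isinstance(ckpt.get(key), dict):
                return ckpt[key]
    return ckpt


def load_tokenizer(path="bpe_tokenizer/tokenizer.json"):
    """Load the BPE tokenizer, assuming the `tokenizers` JSON format."""
    from tokenizers import Tokenizer  # imported lazily; only needed here

    return Tokenizer.from_file(path)
```

A typical first step would be `sorted(load_state_dict("finetune.pt").keys())` to see the parameter names before trying to map them onto a model class. If a file turns out to contain a fully pickled model rather than tensors, recent PyTorch versions will refuse to load it with the default `weights_only` setting.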