---
language:
- vi
- ede
tags:
- cross-lingual-retrieval
- sentencepiece-tokenizer
- colbert
- EViRAL
---

# ColBERT + SentencePiece — EViRAL

Task: Ede query → Vietnamese passage retrieval

## Eval Results

| Metric  | Validation | Test   |
|---------|-----------|--------|
| nDCG@1  | 0.0004    | 0.0004 |
| nDCG@5  | 0.0009    | 0.0011 |
| nDCG@10 | 0.0018    | 0.0019 |
| MRR@10  | 0.0019    | 0.0020 |
| R@50    | 0.0204    | 0.0206 |
| R@100   | 0.0370    | 0.0389 |
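
For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute nDCG@k, MRR@k, and Recall@k (R@k) for a single query. It assumes binary relevance labels and is not the evaluation script that produced the numbers above. The function names and the example passage IDs are hypothetical.

```python
import math


def ndcg_at_k(ranked_ids, relevant_ids, k):
    """nDCG@k with binary gains: each relevant hit contributes 1 / log2(rank + 1)."""
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(position + 2)  # position is 0-based, so rank = position + 1
        for position, pid in enumerate(ranked_ids[:k])
        if pid in relevant
    )
    # Ideal ranking places all relevant passages first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def mrr_at_k(ranked_ids, relevant_ids, k):
    """Reciprocal rank of the first relevant passage within the top k, else 0."""
    relevant = set(relevant_ids)
    for rank, pid in enumerate(ranked_ids[:k], start=1):
        if pid in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant passages that appear in the top k."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


# Hypothetical example: one query, one relevant passage ranked second.
ranked = ["p3", "p1", "p2"]
relevant = {"p1"}
scores = {
    "nDCG@5": ndcg_at_k(ranked, relevant, 5),
    "MRR@10": mrr_at_k(ranked, relevant, 10),
    "R@50": recall_at_k(ranked, relevant, 50),
}
```

Corpus-level scores are typically the mean of these per-query values over all queries in the split.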

## Checkpoints

| File | Description |
|---|---|
| mlm.pt | MLM pre-trained encoder |
| align.pt | cross-lingual aligned encoder |
| finetune.pt | contrastive fine-tuned encoder (best val) |
| sp_tokenizer/spm.model | SentencePiece model |
| sp_tokenizer/spm.vocab | SentencePiece vocab |
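
A minimal sketch of loading these files from a local copy of the repository, assuming `torch` and `sentencepiece` are installed. The internal layout of the `.pt` files (plain `state_dict` or a wrapper dictionary) is not documented here, so the sketch only deserializes the checkpoint and leaves model construction to the reader. The helper names `resolve_paths` and `load_artifacts` are hypothetical.

```python
from pathlib import Path

# File names taken from the checkpoint table above.
ARTIFACT_FILES = {
    "mlm": "mlm.pt",
    "align": "align.pt",
    "finetune": "finetune.pt",
    "tokenizer": "sp_tokenizer/spm.model",
}


def resolve_paths(repo_dir):
    """Map each artifact name to its path inside a local copy of the repo."""
    root = Path(repo_dir)
    return {name: root / relative for name, relative in ARTIFACT_FILES.items()}


def load_artifacts(repo_dir, stage="finetune"):
    """Load the SentencePiece tokenizer and one checkpoint onto the CPU.

    Assumption: the checkpoint can be read with torch.load; what it contains
    (state_dict vs. wrapper dict) must be inspected by the caller.
    """
    # Imported lazily so path resolution works without the heavy dependencies.
    import sentencepiece as spm
    import torch

    paths = resolve_paths(repo_dir)
    tokenizer = spm.SentencePieceProcessor(model_file=str(paths["tokenizer"]))
    checkpoint = torch.load(paths[stage], map_location="cpu")
    return tokenizer, checkpoint
```

After loading, something like `tokenizer.encode(text, out_type=int)` returns token IDs for a query or passage, and printing the type and top-level keys of `checkpoint` is a reasonable first step before wiring it into an encoder.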