---
language:
- vi
- ede
tags:
- cross-lingual-retrieval
- sentencepiece-tokenizer
- colbert
- EViRAL
---
# ColBERT + SentencePiece — EViRAL
Task: Ede query → Vietnamese passage retrieval
## Eval Results
| Metric | Validation | Test |
|---------|-----------|--------|
| nDCG@1 | 0.0004 | 0.0004 |
| nDCG@5 | 0.0009 | 0.0011 |
| nDCG@10 | 0.0018 | 0.0019 |
| MRR@10 | 0.0019 | 0.0020 |
| R@50 | 0.0204 | 0.0206 |
| R@100 | 0.0370 | 0.0389 |
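
The card does not state how these metrics were computed. As a reference for reading the table, the following is a minimal sketch of the standard per-query definitions of nDCG@k, MRR@k and Recall@k, assuming binary relevance (a passage is either relevant to the query or not). The function names and the example passage IDs are hypothetical and are not taken from the EViRAL evaluation code. A reported score would be the mean of the per-query value over all queries in a split.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (rank 1 has discount log2(2) = 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, relevant_ids, k):
    """nDCG@k with binary gains; the ideal ranking puts all relevant passages first."""
    gains = [1.0 if pid in relevant_ids else 0.0 for pid in ranked_ids]
    ideal = [1.0] * min(len(relevant_ids), k)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


def mrr_at_k(ranked_ids, relevant_ids, k):
    """Reciprocal rank of the first relevant passage within the top k, else 0."""
    for rank, pid in enumerate(ranked_ids[:k], start=1):
        if pid in relevant_ids:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant passages that appear in the top k."""
    if not relevant_ids:
        return 0.0
    return len(set(ranked_ids[:k]) & set(relevant_ids)) / len(relevant_ids)


# Hypothetical example: one query whose single relevant passage is ranked second.
ranked = ["p7", "p3", "p9"]
relevant = {"p3"}
print(ndcg_at_k(ranked, relevant, 10))   # 1 / log2(3)
print(mrr_at_k(ranked, relevant, 10))    # 0.5
print(recall_at_k(ranked, relevant, 1))  # 0.0
```

Under these definitions, with one relevant passage per query, Recall@k is the fraction of queries whose relevant passage appears in the top k results.
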
## Checkpoints
| File | Description |
|---|---|
| mlm.pt | MLM pre-trained encoder |
| align.pt | cross-lingual aligned encoder |
| finetune.pt | contrastive fine-tuned encoder (best val) |
| sp_tokenizer/spm.model | SentencePiece model |
| sp_tokenizer/spm.vocab | SentencePiece vocab |