---
language:
- vi
- ede
tags:
- cross-lingual-retrieval
- sentencepiece-tokenizer
- colbert
- EViRAL
---
# ColBERT + SentencePiece — EViRAL
Task: Ede query → Vietnamese passage retrieval
## Evaluation Results
| Metric | Validation | Test |
|---------|-----------|--------|
| nDCG@1 | 0.0004 | 0.0004 |
| nDCG@5 | 0.0009 | 0.0011 |
| nDCG@10 | 0.0018 | 0.0019 |
| MRR@10 | 0.0019 | 0.0020 |
| R@50 | 0.0204 | 0.0206 |
| R@100 | 0.0370 | 0.0389 |
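
The table does not state how these metrics were computed. As a minimal sketch, assuming binary relevance (a passage is either relevant to a query or not) and per-query scores that are later averaged over all queries, the three metric families could be implemented as below. The function names and the example IDs are illustrative and do not come from the EViRAL evaluation code.

```python
import math


def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance nDCG@k for a single query."""
    # DCG: each relevant passage at 1-based rank r contributes 1 / log2(r + 1).
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, pid in enumerate(ranked_ids[:k], start=1)
        if pid in relevant_ids
    )
    # Ideal DCG: all relevant passages placed at the top of the ranking.
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def mrr_at_k(ranked_ids, relevant_ids, k):
    """Reciprocal rank of the first relevant passage within the top k."""
    for rank, pid in enumerate(ranked_ids[:k], start=1):
        if pid in relevant_ids:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant passages that appear within the top k."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Hypothetical example: one query whose only relevant passage is ranked second.
ranked = ["p3", "p1", "p2"]
relevant = {"p1"}
scores = {
    "nDCG@1": ndcg_at_k(ranked, relevant, 1),
    "nDCG@5": ndcg_at_k(ranked, relevant, 5),
    "MRR@10": mrr_at_k(ranked, relevant, 10),
    "R@50": recall_at_k(ranked, relevant, 50),
}
```

Under these assumptions, a reported value is the mean of the per-query score over the validation or test queries.
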
## Checkpoints
| File | Description |
|---|---|
| mlm.pt | MLM pre-trained encoder |
| align.pt | cross-lingual aligned encoder |
| finetune.pt | contrastive fine-tuned encoder (best val) |
| sp_tokenizer/spm.model | SentencePiece model |
| sp_tokenizer/spm.vocab | SentencePiece vocab |
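
The files listed above could be loaded roughly as follows. This is a sketch under assumptions, not the repository's own loading code: it assumes the `.pt` files are ordinary PyTorch checkpoints readable with `torch.load`, that the repository has been downloaded to a local directory, and that `sentencepiece` and `torch` are installed. The helper names `checkpoint_path` and `load_artifacts` and the stage keys are hypothetical. The internal structure of each checkpoint (plain state dict or a wrapper dict) is not documented here, so inspect the returned object before using it.

```python
from pathlib import Path

# File names are copied from the table above; the stage keys are labels chosen
# for this sketch only.
CHECKPOINT_FILES = {
    "mlm": "mlm.pt",
    "align": "align.pt",
    "finetune": "finetune.pt",
    "tokenizer": "sp_tokenizer/spm.model",
}


def checkpoint_path(repo_dir, stage):
    """Return the local path of the file for a given stage key."""
    return Path(repo_dir) / CHECKPOINT_FILES[stage]


def load_artifacts(repo_dir, stage="finetune"):
    """Load the SentencePiece tokenizer and one encoder checkpoint on CPU."""
    # Imported lazily so the path helper works without these packages.
    import sentencepiece as spm
    import torch

    tokenizer = spm.SentencePieceProcessor(
        model_file=str(checkpoint_path(repo_dir, "tokenizer"))
    )
    # Assumption: the file is a standard torch.save() artifact. Its contents
    # are not documented in this card.
    checkpoint = torch.load(checkpoint_path(repo_dir, stage), map_location="cpu")
    return tokenizer, checkpoint
```

For example, `load_artifacts("path/to/local/clone")` would return the tokenizer together with the contents of `finetune.pt`, where `path/to/local/clone` is a placeholder for wherever the files were downloaded.
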