---
language:
- vi
- ede
tags:
- cross-lingual-retrieval
- sentencepiece-tokenizer
- colbert
- EViRAL
---

# ColBERT + SentencePiece — EViRAL

Task: Ede query → Vietnamese passage retrieval

## Eval Results

| Metric  | Validation | Test   |
|---------|------------|--------|
| nDCG@1  | 0.0004     | 0.0004 |
| nDCG@5  | 0.0009     | 0.0011 |
| nDCG@10 | 0.0018     | 0.0019 |
| MRR@10  | 0.0019     | 0.0020 |
| R@50    | 0.0204     | 0.0206 |
| R@100   | 0.0370     | 0.0389 |

## Checkpoints

| File | Description |
|---|---|
| `mlm.pt` | MLM pre-trained encoder |
| `align.pt` | cross-lingual aligned encoder |
| `finetune.pt` | contrastive fine-tuned encoder (best val) |
| `sp_tokenizer/spm.model` | SentencePiece model |
| `sp_tokenizer/spm.vocab` | SentencePiece vocab |
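
For readers who want to sanity-check the numbers under Eval Results, the sketch below shows one common way to compute nDCG@k, MRR@k, and R@k for a single query. This card does not state how its metrics were computed, so treat this as an assumption: it uses binary relevance, and the function names (`ndcg_at_k`, `mrr_at_k`, `recall_at_k`) and passage IDs are hypothetical. Reported scores would be the mean of these per-query values over the evaluation set.

```python
import math
from typing import Sequence, Set


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance nDCG@k for one query (assumes unique IDs in `ranked`)."""
    # Discounted gain: a hit at 0-based position i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, pid in enumerate(ranked[:k])
        if pid in relevant
    )
    # Ideal ranking places every relevant passage first, capped at k.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def mrr_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Reciprocal rank of the first relevant passage within the top k, else 0."""
    for i, pid in enumerate(ranked[:k]):
        if pid in relevant:
            return 1.0 / (i + 1)
    return 0.0


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant passages that appear within the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for pid in ranked[:k] if pid in relevant)
    return hits / len(relevant)


# Hypothetical example: one Ede query whose only relevant Vietnamese
# passage ("p9") is retrieved at rank 3.
ranked_ids = ["p7", "p3", "p9", "p1"]
relevant_ids = {"p9"}

print(ndcg_at_k(ranked_ids, relevant_ids, 1))
print(ndcg_at_k(ranked_ids, relevant_ids, 5))
print(mrr_at_k(ranked_ids, relevant_ids, 10))
print(recall_at_k(ranked_ids, relevant_ids, 50))
```

With a single relevant passage per query, nDCG@k reduces to `1 / log2(rank + 1)` when the passage is ranked within the top k, and R@k becomes a 0/1 hit indicator.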