---
language:
- vi
- ede
tags:
- splade
- unigram
- sentencepiece
- information-retrieval
- cross-lingual-retrieval
- EViRAL
---

# SPLADE + Unigram Tokenizer — EViRAL

Cross-lingual retrieval model for:

Ede query → Vietnamese passage retrieval

## Files

| File | Description |
|---|---|
| mlm.pt | MLM pre-trained SPLADE encoder |
| align.pt | Cross-lingual aligned encoder |
| finetune.pt | Fine-tuned retrieval encoder |
| spm_tokenizer/spm.model | SentencePiece unigram model |
| spm_tokenizer/spm.vocab | SentencePiece vocabulary |

## Tokenizer

Tokenizer type: SentencePiece Unigram
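A minimal sketch of loading the tokenizer files listed above, assuming the `sentencepiece` Python package is installed and the repository has been downloaded to the working directory. The helper names `read_vocab`, `load_tokenizer`, and `tokenize` are hypothetical and not part of this repository. How the encoder checkpoints consume the resulting token IDs is not documented here, so the sketch stops at tokenization.

```python
from pathlib import Path

# Paths taken from the Files table; adjust if the repository lives elsewhere.
MODEL_PATH = Path("spm_tokenizer/spm.model")
VOCAB_PATH = Path("spm_tokenizer/spm.vocab")


def read_vocab(path):
    """Parse a SentencePiece .vocab file (one `piece<TAB>score` pair per line)."""
    entries = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the last tab so the piece itself is kept intact.
            piece, _, score = line.rpartition("\t")
            entries.append((piece, float(score)))
    return entries


def load_tokenizer(model_path=MODEL_PATH):
    """Load the unigram model; imported lazily so read_vocab has no dependency."""
    import sentencepiece as spm

    return spm.SentencePieceProcessor(model_file=str(model_path))


def tokenize(text, model_path=MODEL_PATH):
    """Return (pieces, ids) for one query or passage."""
    sp = load_tokenizer(model_path)
    pieces = sp.encode(text, out_type=str)
    ids = sp.encode(text, out_type=int)
    return pieces, ids
```

Calling `tokenize("Xin chào")` returns the subword pieces and their integer IDs for a Vietnamese string, and `len(read_vocab(VOCAB_PATH))` gives the vocabulary size.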