---
language:
- vi
- ede
tags:
- cross-lingual-retrieval
- bpe-tokenizer
- vanilla-transformer
- EViRAL
---

# Vanilla Transformer + BPE — EViRAL

Task: Ede query → Vietnamese passage retrieval

Config: 6 layers / hidden 512 / 8 heads / FFN 2048

Tokenizer: BPE (vocab 32 000, trained from scratch on Ede + Vi corpus)

## Checkpoints

| file | description |
|---|---|
| mlm.pt | MLM pre-trained encoder |
| align.pt | cross-lingual aligned encoder |
| finetune.pt | contrastive fine-tuned encoder (best val) |
| bpe_tokenizer/tokenizer.json | BPE tokenizer |
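
As a rough sanity check on the configuration above, the sketch below estimates the encoder's parameter count from the listed sizes. It is not taken from the training code: the names `EncoderConfig`, `head_dim`, `layer_params`, and `estimate_params` are hypothetical, and the count assumes a standard Transformer encoder layer (biases on every linear layer, two LayerNorms per layer) and ignores positional embeddings and any output head, none of which this card specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Sizes copied from the model card; the class itself is illustrative."""
    num_layers: int = 6
    hidden_size: int = 512
    num_heads: int = 8
    ffn_size: int = 2048
    vocab_size: int = 32_000


def head_dim(cfg: EncoderConfig) -> int:
    """Per-head dimension; the hidden size must split evenly across heads."""
    if cfg.hidden_size % cfg.num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    return cfg.hidden_size // cfg.num_heads


def layer_params(cfg: EncoderConfig) -> int:
    """Parameters in one encoder layer (assumed layout, see lead-in)."""
    d, f = cfg.hidden_size, cfg.ffn_size
    attention = 4 * (d * d + d)               # Q, K, V, output projections with biases
    feed_forward = (d * f + f) + (f * d + d)  # two linear layers with biases
    layer_norms = 2 * (2 * d)                 # scale and shift for two LayerNorms
    return attention + feed_forward + layer_norms


def estimate_params(cfg: EncoderConfig) -> int:
    """Token embeddings plus all encoder layers; positional embeddings excluded."""
    embeddings = cfg.vocab_size * cfg.hidden_size
    return embeddings + cfg.num_layers * layer_params(cfg)


if __name__ == "__main__":
    cfg = EncoderConfig()
    print(f"head dim:         {head_dim(cfg)}")
    print(f"params per layer: {layer_params(cfg):,}")
    print(f"estimated total:  {estimate_params(cfg):,}")
```

Under these assumptions each head has dimension 64 and the estimate comes to roughly 35.3M parameters, of which about 16.4M are token embeddings. The real checkpoints may differ if the implementation uses learned positional embeddings, omits biases, or adds projection heads.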