UTASI
Task: Ede query → Vietnamese passage retrieval

Config: 6 layers / hidden 512 / 8 heads / FFN 2048

Tokenizer: corpus-driven morpheme segmentation + Ede-only synonym buffer (Vietnamese as pivot)

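To make the config line concrete, here is a rough size estimate for the encoder stack. It is only a sketch: it assumes a standard Transformer encoder layer (four attention projections with biases, a two-layer feed-forward block, two layer norms), which this card does not confirm, and it leaves out embeddings and any pooling or projection head. The function name `encoder_layer_params` is made up for illustration.

```python
def encoder_layer_params(hidden: int, ffn: int) -> int:
    """Parameter count of one standard Transformer encoder layer (assumed layout)."""
    # Q, K, V and output projections, each hidden x hidden plus a bias vector.
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward block: hidden -> ffn -> hidden, both with biases.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Two layer norms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * hidden)
    return attention + feed_forward + layer_norms


# Values taken from the config line above.
LAYERS, HIDDEN, HEADS, FFN = 6, 512, 8, 2048

head_dim = HIDDEN // HEADS                      # per-head dimension
stack_params = LAYERS * encoder_layer_params(HIDDEN, FFN)

print(f"head dim: {head_dim}")                  # head dim: 64
print(f"encoder stack params: {stack_params}")  # encoder stack params: 18914304
```

Under these assumptions the six layers come to roughly 18.9M parameters before the embedding table is added, so the embedding table's share of the total depends on the vocab size.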
| file | description |
|---|---|
| mlm.pt | MLM pre-trained encoder |
| align.pt | cross-lingual aligned encoder |
| finetune.pt | contrastive fine-tuned encoder (best val) |
| vocab.json | morpheme vocab (token → id) |
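A minimal sketch of reading `vocab.json` and mapping already-segmented morphemes to ids. It assumes the file is a flat JSON object of token string to integer id, as the "token → id" description suggests. The unknown-token entry `[UNK]`, the toy tokens `m1` and `m2`, and the helper names `load_vocab` and `encode` are hypothetical; the morpheme segmentation itself is not shown.

```python
import json
import os
import tempfile


def load_vocab(path: str) -> dict:
    """Read a token -> id mapping from a JSON file (assumed flat object)."""
    with open(path, encoding="utf-8") as f:
        vocab = json.load(f)
    if not isinstance(vocab, dict):
        raise ValueError("expected a JSON object mapping token to id")
    return vocab


def encode(morphemes: list, vocab: dict, unk_id: int) -> list:
    """Map segmented morphemes to ids, falling back to unk_id when missing."""
    return [vocab.get(m, unk_id) for m in morphemes]


# Toy stand-in for the real vocab.json; the entries are placeholders.
toy_vocab = {"[UNK]": 0, "m1": 1, "m2": 2}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(toy_vocab, f, ensure_ascii=False)
    vocab = load_vocab(path)

ids = encode(["m1", "m2", "not-in-vocab"], vocab, unk_id=vocab["[UNK]"])
print(ids)  # [1, 2, 0]
```

With the real file, point `load_vocab` at the downloaded `vocab.json` and check which entry, if any, serves as the unknown token before relying on the fallback.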