Difference between bilingual-embedding-large and bilingual-embedding-large-8k

by thomlevy - opened Sep 6, 2024

Sep 6, 2024

Hello,
Thanks for this model.
This doesn't seem obvious from the model card.
What is the difference between bilingual-embedding-large and bilingual-embedding-large-8k?
What does the "8k" suffix mean?
Thanks

dangvantuan

La Javaness org Sep 6, 2024

Hi @thomlevy
The maximum input length of bilingual-embedding-large is 512 tokens, while bilingual-embedding-large-8k accepts up to 8096 tokens.
Tuan

thomlevy

Sep 9, 2024

Hi @dangvantuan
Thanks. Very clear.
Maybe it would help to mention this updated max_seq_length in the model card:
https://huggingface.co/Lajavaness/bilingual-embedding-large-8k#full-model-architecture
Thomas
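
For anyone who wants to check the limit themselves, here is a minimal sketch. It assumes the sentence-transformers package, that both checkpoints load with trust_remote_code=True, and that the non-8k repo id follows the same Lajavaness/ naming as the link above; none of that is confirmed in this thread. The token limits in the dictionary are the ones quoted in the reply above, not values read from the model configs, and the helper names are made up for illustration.

```python
# Limits as stated in this thread (assumption: not verified against the model configs).
STATED_MAX_TOKENS = {
    "Lajavaness/bilingual-embedding-large": 512,      # repo id assumed by analogy
    "Lajavaness/bilingual-embedding-large-8k": 8096,  # repo id taken from the linked model card
}


def tokens_dropped(n_tokens: int, max_seq_length: int) -> int:
    """Return how many tokens are lost if an input is truncated to max_seq_length."""
    return max(0, n_tokens - max_seq_length)


def load_max_seq_length(model_name: str) -> int:
    """Read max_seq_length from the loaded model.

    Downloads the weights and needs sentence-transformers installed,
    so the import is kept inside the function.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name, trust_remote_code=True)
    return model.max_seq_length


if __name__ == "__main__":
    # Compare what each stated limit would mean for a 3000-token document.
    for name, limit in STATED_MAX_TOKENS.items():
        print(name, "would drop", tokens_dropped(3000, limit), "of 3000 tokens")
```

Calling load_max_seq_length("Lajavaness/bilingual-embedding-large-8k") should report the value the model actually uses. By default sentence-transformers truncates inputs longer than that limit instead of raising an error, so with the 512-token model most of a long document would not contribute to the embedding.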

thomlevy changed discussion status to closed Sep 9, 2024
