Does RAG need the embedding model and the chunker to use the same tokenizer? Which one is it?

by dawgctor-air - opened Nov 8, 2025

Nov 8, 2025

According to the Voyage AI documentation, every other embedding model has a specific tokenizer listed, but Voyage Context 3 does not.

What I want to know is which tokenizer this model uses. I see that it is part of the files, but I don't know how to include it in my Docling workflow!

vquilonr

Dec 2, 2025

I think it is in tokenizer_config.json. The tokenizer class is Qwen2Tokenizer, so any tokenizer built on the same Qwen2 vocabulary should work, because the tokens will match.
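Here is a rough sketch of how that could look in practice: load the tokenizer from the model's own files and use it to size your chunks, so the chunker's token counts match what the embedding model sees. The repo id `voyageai/voyage-context-3` and the helper names `load_token_counter` and `pack_sentences` are assumptions for illustration, not something from the Voyage or Docling docs. The packing loop is a plain greedy stand-in for a real chunker.

```python
from typing import Callable, List


def load_token_counter(model_id: str) -> Callable[[str], int]:
    """Return a function that counts tokens with the model's own tokenizer.

    Needs the `transformers` package and access to the Hub; the import is
    lazy so the rest of this sketch runs without it.
    """
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Count content tokens only, without BOS/EOS or other special tokens.
    return lambda text: len(tokenizer.encode(text, add_special_tokens=False))


def pack_sentences(
    sentences: List[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> List[str]:
    """Greedily pack sentences into chunks that stay within max_tokens.

    A single sentence longer than the budget is kept as its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and count_tokens(candidate) > max_tokens:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(" ".join(current))
            current = [sentence]
        else:
            current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


# Usage (assumed repo id; downloads the tokenizer files on first run):
# count = load_token_counter("voyageai/voyage-context-3")
# chunks = pack_sentences(my_sentences, count, max_tokens=512)
```

For Docling specifically, HybridChunker accepts a tokenizer, so the same loaded tokenizer can be handed to it instead of using the greedy loop above. The exact form of that argument has changed between Docling versions, so check the docs for the one you have installed.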
