Question about using the API in production

#51
by fffff123 - opened

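For large-scale production use of bge-m3, which of the model's own methods should be used for vectorization, for example to embed a single text versus a batch of texts? The embedding approach shown in the demo does not seem very practical.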

Beijing Academy of Artificial Intelligence org

The encode function of BGEM3FlagModel supports batch inference on multiple GPUs. To further accelerate the inference, you can use the TEI tool: https://github.com/huggingface/text-embeddings-inference .
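As a rough illustration of the reply above, here is a minimal sketch of embedding one text or many texts through the same `encode` call. The helper names `load_bge_m3` and `embed_texts`, the `chunk_size` parameter, and the default values are assumptions made for this example, not part of FlagEmbedding. The sketch only relies on `encode` accepting a list of strings with `batch_size` and `max_length` and returning a dict with a `dense_vecs` entry.

```python
from typing import Callable, List, Sequence, Union


def load_bge_m3(use_fp16: bool = True):
    """Load BGE-M3 lazily (requires: pip install -U FlagEmbedding)."""
    from FlagEmbedding import BGEM3FlagModel
    return BGEM3FlagModel("BAAI/bge-m3", use_fp16=use_fp16)


def embed_texts(
    texts: Union[str, Sequence[str]],
    encode_fn: Callable[..., dict],
    chunk_size: int = 1024,   # hypothetical: texts handed to encode per call
    batch_size: int = 12,     # forwarded to encode; tune to GPU memory
    max_length: int = 512,    # shorter limit speeds up encoding of short texts
) -> List:
    """Return one dense vector per input text, in input order."""
    # Treat a single string as a batch of one so callers use one code path.
    if isinstance(texts, str):
        texts = [texts]
    vectors: List = []
    # Feed the corpus in chunks so very large inputs are not tokenized at once.
    for start in range(0, len(texts), chunk_size):
        chunk = list(texts[start:start + chunk_size])
        out = encode_fn(chunk, batch_size=batch_size, max_length=max_length)
        vectors.extend(out["dense_vecs"])
    return vectors


# Usage (downloads the model on first run):
#   model = load_bge_m3()
#   one = embed_texts("What is BGE-M3?", model.encode)
#   many = embed_texts(["first document", "second document"], model.encode)
```

Passing `encode_fn` in as a parameter keeps the helper independent of how the model is loaded, so the same code could later wrap a client for a separate embedding server.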

The encode function of BGEM3FlagModel supports batch inference on multiple GPUs. To further accelerate the inference, you can use the TEI tool: https://github.com/huggingface/text-embeddings-inference .

Where, specifically?

If the bottleneck is the embedding API itself, I open-sourced m3serve, a lightweight BGE-M3 server with batching, as a simple starting point: https://github.com/MauroCE/m3serve
