serving the model

#13

by prudant - opened Feb 21, 2024

Hi there! Is there any way to serve your great model with all of its features (embeddings, plus scores for reranking)? Are there open-source or paid options for serving it?
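To make "scores for rerank purposes" concrete, here is a rough sketch of how dense and sparse outputs could be combined into one reranking score once a server returns them. It is plain Python and does not call the model. The vectors, token weights, document ids, and the 0.5/0.5 mix are made-up illustration values, and the helper names (`dense_score`, `sparse_score`, `rank`) are hypothetical, not part of any library.

```python
import math
from typing import Dict, List, Tuple

def dense_score(q: List[float], d: List[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(q, d))
    norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in d))
    return dot / norm if norm else 0.0

def sparse_score(q: Dict[str, float], d: Dict[str, float]) -> float:
    """Lexical match: sum of weight products over tokens present in both."""
    return sum(w * d[tok] for tok, w in q.items() if tok in d)

def rank(
    query_dense: List[float],
    query_sparse: Dict[str, float],
    docs: Dict[str, Tuple[List[float], Dict[str, float]]],
    dense_weight: float = 0.5,   # assumed mix, tune for your data
    sparse_weight: float = 0.5,  # assumed mix, tune for your data
) -> List[Tuple[str, float]]:
    """Return (doc_id, hybrid_score) pairs, best first."""
    scored = [
        (
            doc_id,
            dense_weight * dense_score(query_dense, dense_vec)
            + sparse_weight * sparse_score(query_sparse, sparse_vec),
        )
        for doc_id, (dense_vec, sparse_vec) in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data standing in for what an embedding server would return.
QUERY_DENSE = [1.0, 0.0]
QUERY_SPARSE = {"serve": 0.5, "model": 0.5}
DOCS = {
    "doc_a": ([1.0, 0.0], {"serve": 0.4}),
    "doc_b": ([0.0, 1.0], {"model": 0.2, "vespa": 0.3}),
}

if __name__ == "__main__":
    for doc_id, score in rank(QUERY_DENSE, QUERY_SPARSE, DOCS):
        print(doc_id, round(score, 3))
```

In a real setup the dense vectors and token weights would come from whichever server you pick, and the weights would need tuning on your own queries.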

Shitao

Beijing Academy of Artificial Intelligence org Feb 22, 2024

You might consider using Vespa: https://github.com/vespa-engine/pyvespa/blob/master/docs/sphinx/source/examples/mother-of-all-embedding-models-cloud.ipynb

prudant

Feb 22, 2024

Thanks. Vespa is overkill for my current use cases and testing purposes. With the help of GPT, I built a simple but robust server for fast local testing and development. This is the link:
https://github.com/puppetm4st3r/baai_m3_simple_server

Feel free to share or comment; suggestions are welcome.
Regards, and congrats on the great job and model :)

MauroCE

Apr 24

If you mainly need the embedding server layer, I open-sourced m3serve, a small BGE-M3 server that exposes dense + sparse embeddings with dynamic batching: https://github.com/MauroCE/m3serve
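For readers unfamiliar with dynamic batching, the general idea can be sketched as below: requests are queued, and a worker groups whatever arrives within a short window into one model call. This is a generic illustration, not code from m3serve. `DynamicBatcher`, `fake_encode`, and the batch size and wait values are hypothetical, and `fake_encode` is a stand-in for a real model call.

```python
import asyncio
from typing import Callable, List, Tuple

def fake_encode(batch: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for a model: one tiny vector per text."""
    return [[float(len(text))] for text in batch]

class DynamicBatcher:
    """Collects single requests into batches of up to max_batch_size."""

    def __init__(
        self,
        encode_fn: Callable[[List[str]], List[List[float]]],
        max_batch_size: int = 8,
        max_wait_s: float = 0.01,
    ) -> None:
        self.encode_fn = encode_fn
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue: asyncio.Queue = asyncio.Queue()
        self.batch_sizes: List[int] = []  # recorded for inspection only

    async def submit(self, text: str) -> List[float]:
        """Enqueue one text and wait for its vector."""
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((text, future))
        return await future

    async def run(self) -> None:
        """Worker loop: wait for one item, then fill the batch until the
        size limit or the wait deadline is reached."""
        loop = asyncio.get_running_loop()
        while True:
            items = [await self.queue.get()]
            deadline = loop.time() + self.max_wait_s
            while len(items) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    items.append(await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            # A real model call would block the event loop here; it could be
            # offloaded with loop.run_in_executor instead.
            vectors = self.encode_fn([text for text, _ in items])
            self.batch_sizes.append(len(items))
            for (_, future), vector in zip(items, vectors):
                if not future.done():
                    future.set_result(vector)

async def demo() -> Tuple[List[List[float]], List[int]]:
    # Build the batcher inside the running loop so the queue binds to it.
    batcher = DynamicBatcher(fake_encode, max_batch_size=4, max_wait_s=0.05)
    worker = asyncio.create_task(batcher.run())
    texts = ["a", "bb", "ccc", "dddd", "eeeee"]
    results = await asyncio.gather(*(batcher.submit(t) for t in texts))
    worker.cancel()
    return list(results), batcher.batch_sizes

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The trade-off is latency versus throughput: a longer `max_wait_s` gives fuller batches but makes each request wait longer.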

a-ivanovitch

May 18

https://hub.docker.com/r/sophiacloud/bge-m3-service
