# phate334/multilingual-e5-large-gguf

```python
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True
)
print(output)
```
This model was converted to GGUF format from intfloat/multilingual-e5-large using llama.cpp.
## Run it

- Deploy using Docker:

```shell
docker run -p 8080:8080 \
  -v ./multilingual-e5-large-q4_k_m.gguf:/multilingual-e5-large-q4_k_m.gguf \
  ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb \
  --host 0.0.0.0 --embedding -m /multilingual-e5-large-q4_k_m.gguf
```
- or Docker Compose:

```yaml
services:
  e5-f16:
    image: ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb
    ports:
      - 8080:8080
    volumes:
      - ./multilingual-e5-large-f16.gguf:/multilingual-e5-large-f16.gguf
    command: --host 0.0.0.0 --embedding -m /multilingual-e5-large-f16.gguf
  e5-q4:
    image: ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb
    ports:
      - 8081:8080
    volumes:
      - ./multilingual-e5-large-q4_k_m.gguf:/multilingual-e5-large-q4_k_m.gguf
    command: --host 0.0.0.0 --embedding -m /multilingual-e5-large-q4_k_m.gguf
```
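Once a container is up, you can query it over HTTP. The sketch below is a minimal Python client, assuming the llama.cpp server build used here exposes a `POST /embedding` endpoint that accepts `{"content": "<text>"}` and returns `{"embedding": [...]}` — check your build's endpoints if the path differs. The cosine-similarity helper is handy for comparing the `f16` (port 8080) and `q4_k_m` (port 8081) services defined above.

```python
import json
import math
import urllib.request

# Assumption: the llama.cpp server started above accepts POST /embedding
# with a JSON body {"content": "<text>"} and replies with {"embedding": [...]}.
SERVER_URL = "http://localhost:8080/embedding"


def build_payload(text: str) -> bytes:
    """Serialize the request body expected by the /embedding endpoint."""
    return json.dumps({"content": text}).encode("utf-8")


def get_embedding(text: str, url: str = SERVER_URL) -> list:
    """POST the text to the server and return the embedding vector."""
    req = urllib.request.Request(
        url,
        data=build_payload(text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two vectors, e.g. f16 vs. q4 embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm
```

For example, `cosine_similarity(get_embedding(t), get_embedding(t, "http://localhost:8081/embedding"))` gives a quick sense of how much quality the 4-bit quantization costs on a given text.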
## Model tree for phate334/multilingual-e5-large-gguf

- Base model: intfloat/multilingual-e5-large
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="phate334/multilingual-e5-large-gguf",
    filename="",
)
```
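The upstream intfloat/multilingual-e5-large card notes that E5 inputs should carry task prefixes (`"query: "` for search queries, `"passage: "` for documents). The helpers below are a small sketch of that convention; `embed_texts` assumes an embedding-enabled `llm` object like the one loaded above, using llama-cpp-python's `Llama.embed`.

```python
def as_query(text: str) -> str:
    """Prefix a search query, per the E5 input convention."""
    return f"query: {text}"


def as_passage(text: str) -> str:
    """Prefix a document/passage, per the E5 input convention."""
    return f"passage: {text}"


def embed_texts(llm, texts):
    """Embed a list of already-prefixed texts with llama-cpp-python.

    Assumes `llm` is a llama_cpp.Llama instance created with embeddings
    enabled (e.g. via the server's --embedding flag or embedding=True).
    """
    return [llm.embed(t) for t in texts]
```

Typical usage would be `embed_texts(llm, [as_query("how to bake bread"), as_passage("Mix the flour and water...")])`, then ranking passages by cosine similarity against the query vector.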