How to use from llama.cpp

Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf phate334/multilingual-e5-large-gguf:F16
# Run inference directly in the terminal:
llama-cli -hf phate334/multilingual-e5-large-gguf:F16
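Once the server is running it answers on port 8080 by default. A minimal embedding request might look like the following (a sketch; per the upstream intfloat/multilingual-e5-large model card, E5 inputs should be prefixed with "query: " or "passage: ", and you may need to start the server with the --embedding flag, as in the Docker examples below, for the endpoint to be enabled):

# Request an embedding from the running server via the OpenAI-compatible endpoint:
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "query: what is the capital of France"}'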

Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf phate334/multilingual-e5-large-gguf:F16
# Run inference directly in the terminal:
./llama-cli -hf phate334/multilingual-e5-large-gguf:F16
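For one-off embeddings in the terminal, the release archive also ships a llama-embedding tool; a minimal sketch, assuming the standard llama.cpp common options (-hf to pull the model from the Hub, -p for the input text):

# Print the embedding vector for a single E5-prefixed input:
./llama-embedding -hf phate334/multilingual-e5-large-gguf:F16 -p "query: what is the capital of France"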

Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf phate334/multilingual-e5-large-gguf:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf phate334/multilingual-e5-large-gguf:F16
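Before sending real requests, a freshly built server can be checked with its health endpoint, which returns {"status":"ok"} once the model has finished loading:

# Probe the server's readiness:
curl http://localhost:8080/health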

Use Docker
docker model run hf.co/phate334/multilingual-e5-large-gguf:F16
phate334/multilingual-e5-large-gguf
This model was converted to GGUF format from intfloat/multilingual-e5-large using llama.cpp.
Run it
- Deploy using Docker
$ docker run -p 8080:8080 -v ./multilingual-e5-large-q4_k_m.gguf:/multilingual-e5-large-q4_k_m.gguf ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb --host 0.0.0.0 --embedding -m /multilingual-e5-large-q4_k_m.gguf
or Docker Compose
services:
  e5-f16:
    image: ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb
    ports:
      - "8080:8080"
    volumes:
      - ./multilingual-e5-large-f16.gguf:/multilingual-e5-large-f16.gguf
    command: --host 0.0.0.0 --embedding -m /multilingual-e5-large-f16.gguf
  e5-q4:
    image: ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb
    ports:
      - "8081:8080"
    volumes:
      - ./multilingual-e5-large-q4_k_m.gguf:/multilingual-e5-large-q4_k_m.gguf
    command: --host 0.0.0.0 --embedding -m /multilingual-e5-large-q4_k_m.gguf
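With both services up, the two quantizations can be queried side by side on their mapped host ports. A sketch against the server's native /embedding route (recent builds also expose an OpenAI-compatible /v1/embeddings endpoint; the input text here is arbitrary):

# F16 variant on port 8080:
curl http://localhost:8080/embedding -H "Content-Type: application/json" \
  -d '{"content": "passage: llama.cpp can serve GGUF embedding models"}'

# Q4_K_M variant on port 8081:
curl http://localhost:8081/embedding -H "Content-Type: application/json" \
  -d '{"content": "passage: llama.cpp can serve GGUF embedding models"}'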
Available quantizations
- 4-bit (Q4_K_M)
- 16-bit (F16)
Base model: intfloat/multilingual-e5-large

Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf phate334/multilingual-e5-large-gguf:F16
# Run inference directly in the terminal:
llama-cli -hf phate334/multilingual-e5-large-gguf:F16