BGE-M3 — 8-bit MLX

8-bit MLX quantization of BAAI/bge-m3, for Apple Silicon (~592 MB). Multilingual text-embedding model (1024-dim dense vectors).

Usage

Install the package from a shell:

    pip install -U mlx-embeddings

Then load the model and embed a batch of texts in Python:

    from mlx_embeddings import load, generate

    model, tokenizer = load("TyKaoz/bge-m3-8bit")
    out = generate(model, tokenizer, texts=["Bonjour", "Hello", "你好"])
    print(out.text_embeds.shape)  # (3, 1024)
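
Dense embeddings like these are usually compared with cosine similarity. The following is a minimal sketch of that comparison in plain Python. The helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and the vectors are toy 3-dimensional stand-ins rather than real model output.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Toy stand-ins for embedding rows (assumption: real rows would have 1024 values).
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # nearly parallel to the query
    [-1.0, 0.0, 0.0],  # opposite direction
]
order = rank_by_similarity(query, candidates)
print(order)  # [1, 0, 2]
```

With the real model, the rows would presumably come from something like `out.text_embeds.tolist()`, assuming `text_embeds` is an MLX array.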

| Base | Tool | Precision | Size |
| --- | --- | --- | --- |
| BAAI/bge-m3 | mlx-embeddings | 8-bit · group 64 | ~592 MB |
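
To illustrate what "8-bit · group 64" likely refers to, here is a rough sketch of group-wise affine quantization, where each group of 64 weights shares one scale and one bias. It is plain Python for illustration, not MLX's actual implementation. The function names are hypothetical, and the 16-bit width assumed for each scale and bias is an assumption.

```python
def quantize_group(weights, bits=8):
    """Affine-quantize one group so that w is approximately scale * q + bias."""
    levels = (1 << bits) - 1              # 255 for 8-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo                   # bias is the group minimum

def dequantize_group(q, scale, bias):
    """Recover approximate weights from integer codes."""
    return [scale * v + bias for v in q]

def quantize(weights, group_size=64, bits=8):
    """Split a flat weight list into groups and quantize each independently."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]

def bits_per_weight(group_size=64, bits=8, meta_bits=16):
    """Effective storage cost, assuming one scale and one bias per group."""
    return bits + 2 * meta_bits / group_size

# Toy weights: 128 values, so two groups of 64.
weights = [i / 100 - 0.5 for i in range(128)]
groups = quantize(weights)
restored = [w for q, scale, bias in groups for w in dequantize_group(q, scale, bias)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), bits_per_weight())  # 2 8.5
```

Under these assumptions the per-group metadata adds half a bit per weight, and the rounding error for each weight is bounded by half of its group's scale.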

By TyKaoz (privacy-first native macOS LLM chat client). Licensed under Apache 2.0, inherited from the base model.
