Zen Embedding 8B GGUF

A high-performance text embedding model based on Qwen3-Embedding-8B, quantized to GGUF for efficient inference.

Downloads

Source        URL / command
HuggingFace   hf download zenlm/zen-embedding-8B-GGUF
Direct        https://download.hanzo.ai/llm-models/zen-embedding-8B-Q4_K_M.gguf
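
For scripted setups, the direct link can be fetched with the Python standard library. This is a minimal sketch, not an official downloader: the helper name `download`, the `.part` temporary suffix, and the 1 MiB chunk size are illustrative choices, and the only value taken from this page is the URL.

```python
import shutil
import urllib.request
from pathlib import Path

# Direct link listed in the Downloads table.
DIRECT_URL = "https://download.hanzo.ai/llm-models/zen-embedding-8B-Q4_K_M.gguf"


def download(url: str = DIRECT_URL, dest_dir: str = ".") -> Path:
    """Stream `url` into `dest_dir` and return the local path.

    Hypothetical helper: skips the transfer if the file already exists
    and writes to a temporary ".part" file so an interrupted download
    is never mistaken for a complete model.
    """
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    if dest.exists():
        return dest

    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as resp, open(tmp, "wb") as out:
        # Copy in 1 MiB chunks so the multi-gigabyte file is never held in memory.
        shutil.copyfileobj(resp, out, length=1024 * 1024)
    tmp.replace(dest)
    return dest
```

Calling `download()` with no arguments places `zen-embedding-8B-Q4_K_M.gguf` in the current directory; expect the transfer to take a while given the 4.68 GB file size.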

Features

  • 100+ language support
  • #1 on the MTEB multilingual leaderboard (as reported for the base model, Qwen3-Embedding-8B, at its release)
  • Optimized for semantic search and retrieval
  • GGUF format for efficient CPU/GPU inference
  • Q4_K_M quantization (4.68 GB)
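
Because the model ships as a single GGUF file, a quick sanity check after downloading is to read the fixed-size header at the start of the file. The sketch below assumes a little-endian GGUF file of version 2 or later (magic bytes `GGUF`, a 32-bit version, then 64-bit tensor and metadata counts); `read_gguf_header` and `GGUFHeader` are hypothetical names used only for illustration.

```python
import struct
from typing import NamedTuple


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(path: str) -> GGUFHeader:
    """Read the first 24 bytes of a GGUF file and decode the header.

    Assumes little-endian byte order and GGUF version >= 2, where both
    counts are stored as unsigned 64-bit integers.
    """
    with open(path, "rb") as f:
        raw = f.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # 4-byte version followed by two 8-byte counts (20 bytes after the magic).
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:])
    return GGUFHeader(version, tensor_count, kv_count)
```

A truncated or corrupted download typically fails this check immediately, which is cheaper than waiting for an inference engine to reject the file.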

Usage

Works with llama.cpp and compatible inference engines.
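
As one possible way to use the model for semantic search, the sketch below loads the GGUF file through the llama-cpp-python bindings and ranks documents by cosine similarity. It makes several assumptions: the `llama-cpp-python` package is installed, the model file is available locally, and the GGUF metadata selects a pooled output so that each text yields a single vector. The helper names (`load_embedder`, `cosine`, `rank`) are illustrative and not part of any library.

```python
import math
from typing import Callable, List, Sequence, Tuple

Embedder = Callable[[str], List[float]]


def load_embedder(model_path: str) -> Embedder:
    """Return a text -> vector function backed by llama-cpp-python."""
    # Imported lazily so the ranking helpers below work without the package.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, embedding=True, verbose=False)

    def embed(text: str) -> List[float]:
        # Assumption: the model returns one pooled vector per input text.
        return llm.create_embedding(text)["data"][0]["embedding"]

    return embed


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: str, docs: Sequence[str], embed: Embedder) -> List[Tuple[float, str]]:
    """Score every document against the query, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

With the model file in the working directory, `rank("how do I reset my password?", docs, load_embedder("zen-embedding-8B-Q4_K_M.gguf"))` returns `(score, document)` pairs in descending order of similarity. Passing the embedder in as a plain function keeps the ranking logic independent of the inference engine, so another llama.cpp-compatible backend can be swapped in.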

License

Apache 2.0 (inherited from Qwen3-Embedding)

Format: GGUF
Model size: 8B params
Architecture: qwen3
Quantization: 4-bit


Model tree for zenlm/zen-embedding-8B-GGUF

Base model: Qwen/Qwen3-8B-Base
Quantized: this model (one of 14)