# Qwen3-Reranker-0.6B-GGUF

**🚨 REQUIRED llama.cpp build:** https://github.com/ngxson/llama.cpp/tree/xsn/qwen3_embd_rerank

**This unmerged fix branch is mandatory** for running Qwen3 reranking models. Other GGUF quantizations of the 0.6B reranker on Hugging Face typically fail in mainline `llama.cpp` because they were not produced with this build. **This quantization was produced with the build above and works.**

## Purpose

Multilingual **text-reranking** model in **GGUF** format for efficient CPU/GPU inference with *llama.cpp*-compatible back-ends.
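The required branch can be built and served roughly as follows. This is a sketch, not an official recipe: the branch URL comes from this README, but the model filename, the `--reranking` flag, and the `/v1/rerank` endpoint are assumptions based on upstream `llama.cpp` server conventions — verify them against the branch you check out.

```shell
# Sketch: build the required fix branch and serve the reranker.
# Model filename and server flags below are assumptions — adjust to your setup.
git clone --branch xsn/qwen3_embd_rerank --depth 1 \
    https://github.com/ngxson/llama.cpp.git
cd llama.cpp
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j

# Start the server with the rerank endpoint enabled
./build/bin/llama-server -m Qwen3-Reranker-0.6B.gguf --reranking --port 8080 &

# Query the rerank endpoint: returns a relevance score per document
curl http://localhost:8080/v1/rerank -H "Content-Type: application/json" -d '{
  "query": "What is the capital of France?",
  "documents": ["Paris is the capital of France.", "Llamas live in the Andes."]
}'
```

The server ranks each document against the query and returns per-document relevance scores, which is the typical way reranking GGUF models are consumed from `llama.cpp`-compatible back-ends.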