Update README.md
> **Other sizes:** [0.6B](https://huggingface.co/Voodisss/Qwen3-Reranker-0.6B-GGUF-llama_cpp) · [4B](https://huggingface.co/Voodisss/Qwen3-Reranker-4B-GGUF-llama_cpp) · [8B (this)](https://huggingface.co/Voodisss/Qwen3-Reranker-8B-GGUF-llama_cpp)
## Available files

| File | Quant | Size | Description |
| --------------------------- | ----- | -------- | -------------------------------------------------- |
| `Qwen3-Reranker-8B-F16.gguf` | F16 | 14.10 GB | Full precision, no quality loss |
| `Qwen3-Reranker-8B-Q8_0.gguf` | Q8_0 | 7.49 GB | 8-bit quantized, half the size |
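To grab a single quant without cloning the whole repo, `huggingface-cli` (from the `huggingface_hub` package) can download one file at a time — a sketch, assuming you want the Q8_0 build:

```shell
# Fetch only the Q8_0 quant from this repo (requires `pip install huggingface_hub`)
huggingface-cli download Voodisss/Qwen3-Reranker-8B-GGUF-llama_cpp \
  Qwen3-Reranker-8B-Q8_0.gguf --local-dir .
```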
## Does it work?

Yes. Most community GGUFs of Qwen3-Reranker produce garbage scores (`4.5e-23`) because they're missing reranker-specific tensors. See [llama.cpp #16407](https://github.com/ggml-org/llama.cpp/issues/16407). This one works:
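A quick smoke test, sketched with llama.cpp's built-in server (the `--reranking` flag and `/v1/rerank` endpoint come from llama.cpp's server documentation; adjust the model path and port to your setup):

```shell
# Serve the model with the reranking endpoint enabled
llama-server -m Qwen3-Reranker-8B-Q8_0.gguf --reranking --port 8080

# Score two documents against a query; a working GGUF returns clearly
# separated relevance scores rather than identical near-zero values
curl http://localhost:8080/v1/rerank -H "Content-Type: application/json" -d '{
    "query": "What is the capital of France?",
    "documents": ["Paris is the capital of France.", "Bananas are yellow."]
  }'
```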