---
license: apache-2.0
library_name: mlx
pipeline_tag: text-ranking
base_model:
- Qwen/Qwen3-VL-Reranker-8B
tags:
- mlx
- multimodal rerank
- qwen
- reranker
- 4-bit
---

# Qwen3-VL-Reranker-8B-MLX-4bit

This is the **MLX 4-bit quantized** version of [Qwen/Qwen3-VL-Reranker-8B](https://huggingface.co/Qwen/Qwen3-VL-Reranker-8B), optimized for Apple Silicon (Mac / iPad / iPhone) inference using the [MLX framework](https://github.com/ml-explore/mlx).

## Quantization Info

| Config | Value |
|--------|-------|
| Bits | 4 |
| Group Size | 64 |
| Quantization Mode | Affine |
| Dtype | bfloat16 |
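
To make the settings above concrete, here is a minimal pure-Python sketch of affine quantization applied to a single group of 64 weights at 4 bits. It is an illustration only, not the code MLX runs: the helper names `quantize_group` and `dequantize_group` are hypothetical, and the bits-per-weight estimate assumes each group stores one 16-bit scale and one 16-bit bias.

```python
def quantize_group(weights, bits=4):
    """Affine-quantize one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    # Fall back to a scale of 1.0 when every weight in the group is identical.
    scale = (hi - lo) / levels or 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Map integer codes back to approximate float weights."""
    return [scale * q + bias for q in codes]


BITS = 4
GROUP_SIZE = 64

# Toy weights spread evenly over [-0.5, 0.5]; real weights come from the model.
group = [i / (GROUP_SIZE - 1) - 0.5 for i in range(GROUP_SIZE)]

codes, scale, bias = quantize_group(group, bits=BITS)
restored = dequantize_group(codes, scale, bias)
max_error = max(abs(a - b) for a, b in zip(group, restored))

# Assumption: one 16-bit scale and one 16-bit bias are stored per group.
bits_per_weight = BITS + (16 + 16) / GROUP_SIZE

print(f"max reconstruction error: {max_error:.4f}")
print(f"effective bits per weight: {bits_per_weight}")
```

Under that storage assumption, a smaller group size lowers the reconstruction error but raises the per-weight overhead, since more scales and biases have to be kept.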

## Model Overview

- **Model Type**: MultiModal Reranker
- **Supported Modalities**: Text, images, screenshots, videos, and arbitrary multimodal combinations
- **Parameters**: 8B
- **Context Length**: 32k
- **Languages**: 30+

## Requirements

```bash
pip install mlx-lm transformers
```
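
After installing, it can be useful to confirm that both packages are visible to the interpreter you plan to run. The following is a small sketch using only the standard library; the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["mlx-lm", "transformers"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```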

## Usage with `mlx-lm`

```python
from mlx_lm import load

# Fetches the weights from the Hub on first use, then loads the model and tokenizer.
model, tokenizer = load("Zeknes/Qwen3-VL-Reranker-8B-MLX-4bit")
```

For full usage examples (multimodal reranking, vLLM), refer to the original model page: [Qwen3-VL-Reranker-8B](https://huggingface.co/Qwen/Qwen3-VL-Reranker-8B)
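
As a rough illustration of how reranker outputs are typically consumed, the sketch below turns a pair of logits into a relevance score and sorts candidates by it. It assumes the model scores a query-document pair through "yes" and "no" token logits, which this card does not state, so check the original model page for the actual procedure. The function names and all logit values are made up for the example.

```python
import math


def relevance_score(yes_logit, no_logit):
    """Probability of "yes" under a softmax over the two logits."""
    # Equivalent to a sigmoid of the logit difference.
    return 1.0 / (1.0 + math.exp(no_logit - yes_logit))


def rerank(candidates):
    """Sort (document, yes_logit, no_logit) triples by descending relevance."""
    scored = [(doc, relevance_score(y, n)) for doc, y, n in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up logits for illustration only; real values would come from the model.
candidates = [
    ("doc_a", 0.2, 1.5),
    ("doc_b", 3.1, -0.4),
    ("doc_c", 1.0, 1.0),
]

ranking = rerank(candidates)
for doc, score in ranking:
    print(f"{doc}: {score:.3f}")
```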

## Citation

```bibtex
@article{qwen3vlembedding,
  title={Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking},
  author={Li, Mingxin and Zhang, Yanzhao and Long, Dingkun and Chen, Keqin and Song, Sibo and Bai, Shuai and Yang, Zhibo and Xie, Pengjun and Yang, An and Liu, Dayiheng and Zhou, Jingren and Lin, Junyang},
  journal={arXiv preprint arXiv:2601.04720},
  year={2026}
}
```