HuggingFaceH4/ultrachat_200k
Qwen3 4b reranker full vllm adapted 🚀
This is a compressed version of danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed, quantized with llm-compressor using the W4A16 scheme (4-bit weights, 16-bit activations).
python3 -m vllm.entrypoints.openai.api_server --model 'dolfsai/Qwen3-Reranker-4B-seq-cls-vllm-W4A16' --task classify
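Once the server started by the command above is running, it can be queried over HTTP. The following is a minimal sketch, not the author's documented client: it assumes the server listens on vLLM's default address `http://localhost:8000` and exposes vLLM's `/classify` route for models served with `--task classify`. The helper names `build_classify_request` and `classify` are hypothetical, and the exact text format the reranker expects for each input is not shown here; take it from the usage guide.

```python
import json
import urllib.request

# Assumption: vLLM's default host and port; change if the server was started differently.
BASE_URL = "http://localhost:8000"
MODEL = "dolfsai/Qwen3-Reranker-4B-seq-cls-vllm-W4A16"


def build_classify_request(texts, model=MODEL, base_url=BASE_URL):
    """Build (but do not send) a POST request for the assumed /classify route.

    `texts` must already be formatted the way the model's usage guide requires.
    """
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/classify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(texts):
    """Send the request to the running server and return the decoded JSON reply."""
    with urllib.request.urlopen(build_classify_request(texts)) as resp:
        return json.load(resp)
```

Calling `classify([...])` with a list of pre-formatted query/document strings should return one result per input. Inspect the raw JSON first rather than relying on particular field names, since the response layout can differ between vLLM versions.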
Important: You MUST read the usage guide for this model before using it; correct usage depends on it.