Hyper-AI's Collections

qwen3-vl-fp8

FP8 quantization of the Qwen3-VL models: roughly halves memory usage, delivers around a 30% inference speedup, and can be served directly with vLLM.
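
A minimal serving sketch, assuming vLLM is installed and the model repo id follows the usual `<org>/<model>` pattern (the exact repo name here is a placeholder, not confirmed by this collection):

```shell
# Serve an FP8-quantized Qwen3-VL checkpoint with vLLM's OpenAI-compatible server.
# vLLM auto-detects FP8 weights in pre-quantized checkpoints; --quantization fp8
# can be passed explicitly if needed. Replace the model id with one from this collection.
vllm serve <org>/<qwen3-vl-fp8-model> \
    --quantization fp8 \
    --max-model-len 8192
```

Once the server is up, requests go to the standard OpenAI-compatible endpoint (default `http://localhost:8000/v1`).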