Hyper-AI's Collections

qwen3-vl-fp8

FP8 quantization of the Qwen3-VL models: roughly halves memory usage, delivers around a 30% inference speedup, and can be served directly with vLLM.
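
A minimal serving sketch, assuming vLLM is installed and the model repo id follows the usual `<org>/<model>` pattern (the exact repo name here is a placeholder, not confirmed by this collection):

```shell
# Serve an FP8-quantized Qwen3-VL checkpoint with vLLM's OpenAI-compatible server.
# vLLM auto-detects FP8 weights in pre-quantized checkpoints; --quantization fp8
# can be passed explicitly if needed. Replace the model id with one from this collection.
vllm serve <org>/<qwen3-vl-fp8-model> \
    --quantization fp8 \
    --max-model-len 8192
```

Once the server is up, requests go to the standard OpenAI-compatible endpoint (default `http://localhost:8000/v1`).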