Hyper-AI's Collections

qwen3.5-fp8

FP8 quantization of the Qwen3.5 models: memory usage drops by nearly half, throughput improves by roughly 30%, and the checkpoints can be served with vLLM.
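As a sketch, serving one of these checkpoints with vLLM's OpenAI-compatible server could look like the following; the repo id `Hyper-AI/qwen3.5-fp8` is an assumption, so substitute the actual model path from this collection.

```shell
# Launch vLLM's OpenAI-compatible server for a hypothetical FP8 checkpoint.
# The model id below is illustrative; replace it with the real repo id.
# --quantization fp8 selects vLLM's FP8 weight quantization path.
vllm serve Hyper-AI/qwen3.5-fp8 --quantization fp8
```

Once the server is up, the model is reachable through the standard OpenAI-compatible endpoint (default `http://localhost:8000/v1`).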