Hyper-AI's Collections

gemma-4-fp8

FP8 quantization of the gemma-4 models: memory use drops by nearly half, inference speeds up by 30%, and the models can be run with vllm serve.
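
The "nearly half" figure is consistent with simple byte counting: a 16-bit weight takes 2 bytes and an FP8 weight takes 1. The sketch below works through that arithmetic. It assumes the unquantized baseline is BF16, and it counts weights only, not activations or the KV cache. The parameter count and the quantized fraction are hypothetical placeholders, not figures for any gemma-4 model.

```python
# Rough weight-memory estimate: BF16 stores 2 bytes per parameter, FP8 stores 1.
BYTES_BF16 = 2
BYTES_FP8 = 1
GIB = 1024 ** 3


def weight_memory_gib(num_params: float, quantized_fraction: float = 1.0) -> tuple[float, float]:
    """Return (bf16_gib, fp8_gib) for a model with num_params weights.

    quantized_fraction is the share of weights stored in FP8; the rest
    (for example embeddings, if left unquantized) are assumed to stay in BF16.
    """
    if not 0.0 <= quantized_fraction <= 1.0:
        raise ValueError("quantized_fraction must be between 0 and 1")
    bf16_bytes = num_params * BYTES_BF16
    # Average bytes per parameter after quantizing part of the model.
    mixed_bytes_per_param = (
        quantized_fraction * BYTES_FP8 + (1.0 - quantized_fraction) * BYTES_BF16
    )
    fp8_bytes = num_params * mixed_bytes_per_param
    return bf16_bytes / GIB, fp8_bytes / GIB


# Hypothetical values for illustration only: 10B parameters, 95% of them in FP8.
bf16, fp8 = weight_memory_gib(10e9, quantized_fraction=0.95)
print(f"BF16: {bf16:.1f} GiB, FP8: {fp8:.1f} GiB, saving: {1 - fp8 / bf16:.1%}")
```

With every weight in FP8 the saving would be exactly half. Any layers kept at 16 bits pull it slightly below that, which is one plausible reading of "nearly half".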