FP8 quant

#15
by Qnibbles - opened

Can we please get an FP8 version that also works with vLLM's fp8_e4m3 K/V cache?
