FP8-Dynamic Version for vLLM

#5
by drguolai - opened

Can we have an FP8-Dynamic version for vLLM for faster inference?
Thanks!

Hi @drguolai ,

Thanks for reaching out. We have forwarded your request for an FP8-Dynamic version of the google/translategemma-27b-it model to the relevant team.

It needs to be able to run on vLLM. Thanks.
