FP8-Dynamic Version for vLLM
#5, opened by drguolai
Can we have an FP8-Dynamic version for vLLM for faster inference?
Thanks!
Hi @drguolai,
Thanks for reaching out. We have forwarded your request for an FP8-Dynamic version of the google/translategemma-27b-it model to the relevant team.
It needs to be able to run on vLLM, thanks.