Salesforce/codet5p-16b


ONNX version or compatibility with T5ForConditionalGeneration

#1

by michaelfeil - opened May 20, 2023 • edited May 20, 2023

I would like to convert this model (the 2B, 6B, and 16B variants) to a faster format for accelerated inference.
Options:

- CTranslate2: supports many architectures such as T5, mT5, GPT-J, and GPT-2. The converted codet5p-770m-py now runs at high speed with a 1320 MiB CUDA memory footprint and supports batch inference, which I think is awesome: https://huggingface.co/michaelfeil/ct2fast-codet5p-770m-py. Is there any way to convert this model to a T5 architecture?
- ONNX, then ONNX Runtime (ORT) or NVIDIA TensorRT: CodeT5pModuleConfig has no ONNX implementation (e.g. see Codegen2).
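For context, the CTranslate2 route for the 770M checkpoint can be sketched roughly as follows. This is a minimal sketch, not the exact procedure used for the linked repository: it assumes `Salesforce/codet5p-770m-py` loads as a T5-style model, and the helper names (`build_convert_command`, `convert`, `generate`), the output directory, and the `int8_float16` quantization choice are illustrative assumptions. The same command is not expected to work as-is for the 2B, 6B, and 16B checkpoints, which is the point of the question.

```python
import shlex
import subprocess
from typing import List


def build_convert_command(model_id: str, output_dir: str,
                          quantization: str = "int8_float16") -> List[str]:
    """Build the argv for CTranslate2's `ct2-transformers-converter` CLI.

    The quantization default is an assumption chosen for illustration.
    """
    return [
        "ct2-transformers-converter",
        "--model", model_id,
        "--output_dir", output_dir,
        "--quantization", quantization,
        "--force",  # overwrite output_dir if it already exists
    ]


def convert(model_id: str, output_dir: str,
            quantization: str = "int8_float16") -> None:
    """Run the converter (requires `pip install ctranslate2 transformers`)."""
    cmd = build_convert_command(model_id, output_dir, quantization)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


def generate(model_dir: str, tokenizer_id: str, prompts: List[str],
             device: str = "cuda") -> List[str]:
    """Batch inference with a converted encoder-decoder model."""
    # Imported lazily so the helpers above work without these packages.
    import ctranslate2
    import transformers

    tokenizer = transformers.AutoTokenizer.from_pretrained(tokenizer_id)
    translator = ctranslate2.Translator(model_dir, device=device)
    # CTranslate2 takes token strings rather than token ids.
    batch = [tokenizer.convert_ids_to_tokens(tokenizer.encode(p))
             for p in prompts]
    results = translator.translate_batch(batch, max_decoding_length=128)
    return [
        tokenizer.decode(
            tokenizer.convert_tokens_to_ids(r.hypotheses[0]),
            skip_special_tokens=True,
        )
        for r in results
    ]


# Hypothetical usage (the directory name is a placeholder):
# convert("Salesforce/codet5p-770m-py", "ct2-codet5p-770m-py")
# print(generate("ct2-codet5p-770m-py", "Salesforce/codet5p-770m-py",
#                ["def hello_world():"]))
```

Building the command separately from running it makes it easy to inspect or log the exact converter invocation before committing to a long conversion job.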

Any advice?
