GGUF
#5
by enricos - opened
I wanted to ask whether inference is planned exclusively through the Transformers library, or whether the model architecture is also supported, directly or indirectly, by llama.cpp.
Would it make sense to convert the files to GGUF for this purpose?
Thanks
Hi @enricos ,
The model is expected to be compatible with GGUF for use with llama.cpp, although we don’t have anything more to share regarding an official release at the moment.
That said, a community-made GGUF version is already available here:
https://huggingface.co/robertobissanti/EngGPT2-16B-A3B-GGUF
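For anyone who wants to try the community GGUF, here is a minimal sketch of fetching one file from that repo with `huggingface_hub`. The quantization tag `Q4_K_M` and the helper names (`gguf_files`, `pick_quant`, `download`) are assumptions for illustration only. I have not checked which files the repo actually contains, so the script lists them first and matches by name.

```python
from typing import Iterable, List, Optional

# Community repo linked in the reply above.
REPO_ID = "robertobissanti/EngGPT2-16B-A3B-GGUF"


def gguf_files(filenames: Iterable[str]) -> List[str]:
    """Return only the .gguf entries, sorted for stable output."""
    return sorted(f for f in filenames if f.lower().endswith(".gguf"))


def pick_quant(filenames: Iterable[str], quant: str) -> Optional[str]:
    """Return the first .gguf name containing `quant` (case-insensitive), else None."""
    wanted = quant.lower()
    for name in gguf_files(filenames):
        if wanted in name.lower():
            return name
    return None


def download(quant: str = "Q4_K_M") -> str:
    """Download the matching file and return its local path.

    The default quant tag is a guess, not something confirmed for this repo.
    """
    # Lazy import so the helpers above work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    name = pick_quant(list_repo_files(REPO_ID), quant)
    if name is None:
        raise FileNotFoundError(f"no .gguf file matching {quant!r} in {REPO_ID}")
    return hf_hub_download(repo_id=REPO_ID, filename=name)


if __name__ == "__main__":
    print(download())
```

The printed path can then be passed to llama.cpp, for example `llama-cli -m <path>`, assuming your llama.cpp build supports this model's architecture.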