Exl Quants Please

#15
by rjmehta - opened

@turboderp @LoneStriker Exl quants please, if possible, though I'm not sure whether exllama supports Gemma.

It doesn't, yet. 9B may be supported soon, though really it needs support in flash-attn to work correctly. 27B will not work at all without it.

