Any Chance of GGUF Quantizations?

#5
by facehuggerfromspace - opened

I am really looking forward to giving this model a try on my Strix Halo. Is there any chance of us getting Q4 quantizations of this model? This could be a great replacement for GPT-OSS 120B.

Hi,
I agree.
A GGUF version would help raise awareness of this model, which, given the results presented, deserves to be better known and have greater visibility.
Thank you in advance.

Thank you for your suggestion.
We will work on adding support for GGUF and llama.cpp. According to our roadmap, we will prioritize Int8/Int4 quantization combined with vLLM capabilities.
Please stay tuned.
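In the meantime, for anyone who wants to try producing a quant themselves once converter support lands, this is a sketch of the standard llama.cpp workflow for turning a Hugging Face checkpoint into a Q4 GGUF. The model paths are placeholders, and it assumes this model's architecture is (or becomes) supported by llama.cpp's converter, which is not yet the case per the reply above.

```shell
# Build llama.cpp and install the converter's Python dependencies.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release
pip install -r requirements.txt

# Convert the HF checkpoint to an FP16 GGUF.
# /path/to/model is a placeholder for the downloaded checkpoint directory.
python convert_hf_to_gguf.py /path/to/model --outfile model-f16.gguf

# Quantize to Q4_K_M, a commonly used 4-bit scheme.
./build/bin/llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M
```

Q4_K_M is usually a good quality/size trade-off for local inference; other presets (Q4_0, Q5_K_M, Q8_0) can be substituted in the last command.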
