ewald1976/Pantutu-24b-1.0

#2344
by ewald1976 - opened

Thank you very much!
ewald1976/Pantutu-24b-1.0
❤️

ewald1976 changed discussion status to closed

Sorry, either config.json doesn't exist or is malformed, so I can't queue it. )=

And it's already quantized, so I can't queue it.

It's already a GGUF, and @ewald1976 uploaded it in F16 precision, so we could process it if desired. I'm not sure we should, since he closed the model request, but if you want us to do it, we can. Sorry @simonko912, your tool unfortunately won't be able to queue already-quantized models, as they require special handling.

If it is possible to quantize my F16 i would be very glad. If not, then excuse me and i will add the missing parts.
Thank you all.

ewald1976 changed discussion status to open

You can't "quantize" a GGUF further, but you can make lower-quality versions (8-bit, etc.). I don't know about imatrix, but it would be easier if you added the safetensors (and make sure they're in the root of the repo).

@simonko912 An F16 GGUF will work almost as well as SafeTensors. We simply skip the first step of converting SafeTensors to the source GGUF and use his F16 GGUF directly as the source GGUF. The precision lost from BF16 vs. F16 is so tiny that we used F16 as our source GGUF even for BF16 models until around a year ago. imatrix and everything else will work perfectly fine when quantizing from his F16 GGUF. It really is just that the tool I wrote for you doesn't currently support it.
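For anyone wanting to do this themselves, the workflow described above (quantize and compute an imatrix straight from the uploaded F16 GGUF, skipping the SafeTensors conversion) maps roughly onto the standard llama.cpp tools like this — just a sketch, assuming a recent llama.cpp build with the `llama-imatrix` and `llama-quantize` binaries, and a placeholder calibration file `calibration.txt`:

```shell
# Sketch: quantizing directly from an F16 source GGUF with llama.cpp.
# Binary names assume a recent llama.cpp build; calibration.txt is a
# hypothetical placeholder for whatever calibration text you use.

SRC=Pantutu-24B-v0.3-f16.gguf

# 1. Compute an importance matrix from the F16 source GGUF
#    (works fine on F16 — no SafeTensors needed)
./llama-imatrix -m "$SRC" -f calibration.txt -o imatrix.dat

# 2. Produce lower-bit quantizations from the same F16 source,
#    passing the imatrix for the low-bit types that benefit from it
./llama-quantize --imatrix imatrix.dat "$SRC" Pantutu-24B-v0.3-Q4_K_M.gguf Q4_K_M
./llama-quantize "$SRC" Pantutu-24B-v0.3-Q8_0.gguf Q8_0
```

The SafeTensors-to-GGUF conversion step (llama.cpp's `convert_hf_to_gguf.py`) is simply skipped here, because the uploaded F16 GGUF already is that intermediate artifact.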

Pantutu-24B-v0.3-f16.gguf is now being downloaded to rich1, and I will manually push a quantization job for it to rich1 as soon as the download is completed.

I wasn't sure about the imatrix ones, but if it works, it works, and that's okay.