ewald1976/Pantutu-24b-1.0

#2344
by ewald1976 - opened

Thank you very much!
ewald1976/Pantutu-24b-1.0
❤️

ewald1976 changed discussion status to closed

Sorry, either config.json doesn't exist or is malformed, so I can't queue it. )=

And it's already quantized, so I can't queue it.

It's already a GGUF, and @ewald1976 uploaded it in F16 precision, so we could process it if desired. I'm not sure we should, since he closed the model request, but if you want us to do it, we can. Sorry @simonko912, your tool unfortunately won't be able to queue already-quantized models, as they require special handling.

If it is possible to quantize my F16 i would be very glad. If not, then excuse me and i will add the missing parts.
Thank you all.

ewald1976 changed discussion status to open

You can't "quantize" a GGUF further, but you can make lower-quality versions (8-bit, etc.). I don't know about imatrix, but it would be easier if you added the safetensors (and make sure they're in the root of the repo).

@simonko912 An F16 GGUF will work almost as well as SafeTensors. We simply skip the first step of converting SafeTensors to the source GGUF and use his F16 GGUF directly as the source GGUF. The precision lost from BF16 vs. F16 is so tiny that we used F16 as our source GGUF even for BF16 models until around a year ago. imatrix and everything else will work perfectly fine when quantizing from his F16 GGUF. It really is just that the tool I wrote for you doesn't currently support it.
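For anyone wanting to do this themselves, the workflow described above (quantize and compute an imatrix straight from the uploaded F16 GGUF, skipping the SafeTensors conversion) maps roughly onto the standard llama.cpp tools like this — just a sketch, assuming a recent llama.cpp build with the `llama-imatrix` and `llama-quantize` binaries, and a placeholder calibration file `calibration.txt`:

```shell
# Sketch: quantizing directly from an F16 source GGUF with llama.cpp.
# Binary names assume a recent llama.cpp build; calibration.txt is a
# hypothetical placeholder for whatever calibration text you use.

SRC=Pantutu-24B-v0.3-f16.gguf

# 1. Compute an importance matrix from the F16 source GGUF
#    (works fine on F16 — no SafeTensors needed)
./llama-imatrix -m "$SRC" -f calibration.txt -o imatrix.dat

# 2. Produce lower-bit quantizations from the same F16 source,
#    passing the imatrix for the low-bit types that benefit from it
./llama-quantize --imatrix imatrix.dat "$SRC" Pantutu-24B-v0.3-Q4_K_M.gguf Q4_K_M
./llama-quantize "$SRC" Pantutu-24B-v0.3-Q8_0.gguf Q8_0
```

The SafeTensors-to-GGUF conversion step (llama.cpp's `convert_hf_to_gguf.py`) is simply skipped here, because the uploaded F16 GGUF already is that intermediate artifact.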

Pantutu-24B-v0.3-f16.gguf is now being downloaded to rich1, and I will manually push a quantization job for it to rich1 as soon as the download is completed.

I wasn't sure about the imatrix ones, but if it works, it works, and that's okay.