2X Qwen 3.5 9B SOMPOA Heresy

#2336
by redaihf - opened

Thanks @MuXodious !

Oh, wait. MTP needs the support PR, and I need to see whether it works at all with the PR after my frankensteining attempt.

Remind me when it's merged so I can update llama.cpp and queue; right now I'm not queueing.

You can queue the non-MTP version. I'll let y'all know when the MTP PR gets the go signal. MTP may not be of much use for the 9B and below, given the standard 98GB VRAM under everyone's hands these days, but it should provide a good speed boost to larger models.

P.S. I got a nice speed bump with this thing on, not bad: ~75 t/s -> 100 t/s.

@RichardErkhov can we please have the non-MTP model queued? Sorry for the confusion.

https://huggingface.co/MuXodious/Qwen3.5-9B-SOMPOA-heresy

It's queued!

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Qwen3.5-9B-SOMPOA-heresy-GGUF for quants to appear.

Please don't forget to remind me when the MTP PR finally merges =)

It will take some time. They are currently making adjustments to the scaffolding code prior to finalising the MTP PR. I'll follow up with a notice once they green-light the PR. THANKS for the quants as always.
