Worth it

#1
by redaihf - opened

This model is not merely decensored but contextually realigns itself. It's not as smart as Cydonia 4.3 but it is easier to steer. The only residual noncompliance appears to be shortened generation length with some prompts.

I'm glad to hear that there is substance in this madness and thank you kindly for the feedback. Let me know if you have a model in mind that could use some hereticisation.

Mr. MuXodious, you could be the first person in the history of mankind to give zai-org/GLM-4.7-Flash a proper hereticisation™️

Thanks for all your work! 🙌

If they fixed the reasoning generation loop in GLM-4.6V-Flash, why not. No promises, though.

Edit:
This hurts (my wallet).
Elapsed time: 10m 35s
Estimated remaining time: 3h 1m
Also, it seems that custom refusal markers do, indeed, improve refusal detection — unless it's a false positive due to model quirks.
Initial refusals: 97/100 vs. 44/100. (Edit 2: It seems to fluctuate between 93 and 97 with each run from the beginning.)
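For anyone wondering what marker-based refusal counting boils down to, here's a minimal sketch. The marker list, the 200-character window, and the function names are my own illustration — not Heretic's actual implementation:

```python
# Illustrative marker list — a real tool would use a larger, tuned set.
REFUSAL_MARKERS = [
    "i can't", "i cannot", "i won't", "i'm sorry",
    "as an ai", "i'm not able to",
]

def is_refusal(response: str) -> bool:
    """Flag a response as a refusal if any marker appears near the start."""
    head = response[:200].lower()  # refusals usually open the response
    return any(marker in head for marker in REFUSAL_MARKERS)

def count_refusals(responses) -> int:
    """Count flagged responses, e.g. the 'Initial refusals: N/100' metric."""
    return sum(is_refusal(r) for r in responses)
```

The fluctuation between runs would then come from sampling: the same prompt can land on either side of the marker check depending on how the generation opens.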

First test successful, will look a bit more into it tomorrow...

Let me know if you have a model in mind that could use some hereticisation.

Please try Cydonia 4.3, which is the smartest model I've ever seen. The Heretic V2 version is almost as smart and more decensored. A Magnitude-Preserving Orthogonal Ablation version might be even better.

I had it in mind, but coder3101/Cydonia-24B-v4.3-heretic is also pretty slick with them hereticisation scores. I thought it would be pointless to push for marginal improvements in KLD/Refusals. But, why not. I could use a Magnitude-Preserving Orthogonal Ablation, too.

There is a substantial decensoring improvement between standard Heretic and MPOA. Standard Heretic models exhibit covert noncompliance, but Harbinger Absolute Heresy (MPOA) does not. It is actually mostly decensored.

So, MPOA makes models more uncensored by somehow overriding the noncompliance behaviours (covert refusals) in the model, thus increasing their willingness. Are you sure it is not specific to Harbinger, or just some cosmic coincidence? You've scratched my curiosity bone anyway, so I'm making an MPOA version of gemma-3n-E4B-it, which I had hereticised before without MPOA.
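For the record, my mental model of the "magnitude-preserving" part, sketched in NumPy. The `mpoa_ablate` helper is my own illustration of "project the refusal direction out of the weights, then restore each row's original magnitude" — not the actual Heretic code:

```python
import numpy as np

def mpoa_ablate(W: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Orthogonally ablate direction d from each row of W,
    then rescale each row back to its original L2 norm."""
    d = d / np.linalg.norm(d)                      # unit refusal direction
    orig_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_proj = W - np.outer(W @ d, d)                # standard orthogonal ablation
    new_norms = np.linalg.norm(W_proj, axis=1, keepdims=True)
    # Magnitude preservation: rescaling by a scalar keeps rows
    # orthogonal to d while restoring their original norms.
    return W_proj * (orig_norms / np.maximum(new_norms, 1e-8))
```

If that reading is right, the weights lose the refusal direction but keep their overall scale, which would explain why MPOA models stay coherent while shedding the covert refusals.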

Are you sure it is not specific to Harbinger, or just some cosmic coincidence?

It may be particularly effective with Mistral-based models. Llama-based models are quite censored by default and resistant to jailbreaks. I will await Gemma-MPOA with interest.

I won't keep you waiting; instead, I would like to indulge you, good sir. https://huggingface.co/MuXodious/gemma-3n-E4B-it-absolute-heresy-MPOA

Non-MPOA version: https://huggingface.co/MuXodious/gemma-3n-E4B-it-absolute-heresy

Here you go, lad: https://huggingface.co/MuXodious/Cydonia-24B-v4.3-absolute-heresy

That's not abliterated, that's obliterated 🫨🤯

The MLX quant for it was instantaneous! I don't know how you pull that off, but keep up the good work, mate. 🍻

It's a kind of Magic ✨

But honestly, try out this space for yourself, works 90% of the time 😉

https://huggingface.co/spaces/mlx-community/mlx-my-repo
