fwiw it definitely is an improvement

#1
by SerialKicked - opened

In v1, both thinking and tool-calling were broken even in chat completion mode, while in v2 they work just fine. I haven't played with the more creative side much yet, but it seemed okay at it.

I'm kinda curious, any personal opinion on the "fine-tunability" of Q 3.5 so far?

Finetunes surprisingly well for a Qwen model IMO; the only downside is the lack of NSFW data in Qwen's pretrain.

Still working through finding the right hyperparams for this model though; it can be difficult to tell whether you're overfitting or underfitting with it, so I'm still fiddling with params and slight variations of the datamix.

I think this was a bit overfit according to some other testers, though I personally found it OK. I'm probably a bit biased, though. Definitely the most interesting model to finetune at the moment for me personally.

Yeah, gotcha. I understand it's still a bit early to be entirely sure, but that's good to hear, not sure I would have enjoyed another year of MS24B-only. πŸ˜„

v1 was definitely an overfit, not sure for v2 yet, it seems fine, but I didn't test much.

Anyway, thanks for the models, and good luck!

@zerofata why do you use vanilla Qwen as the base instead of decensored ones like ArliAI/Qwen3.5-27B-Derestricted or Bobi099/Qwen3.5-27B-heretic?

It sounds like a better starting point to finetune an RP model on; ReadyArt/Omega-Evolution-27B-v1.0 did this too.

I've got mixed feelings on abliteration, even after all the improvements made to it in the last few months.

I haven't yet seen an abliteration that can tell the difference between in-character refusals and assistant refusals. Personal opinion, but it's much easier to uncensor a model than it is to fix the damage caused by abliteration that has removed the concept of "no" from a model, even the "better" ones that try to leave harmless prompts unchanged.

Same, I've probably tested hundreds of models at this point, they are always a downgrade. Abliteration has some use cases, but this ain't one of them.

Thank you very much for your explanation, I never considered that.

> Personal opinion, but it's much easier to uncensor a model than it is to fix the damage caused by abliteration that has removed the concept of "no" from a model

I'm curious - what techniques for uncensoring are you referring to outside of abliteration?

Finetuning / training the model primarily.

https://huggingface.co/dphn/Dolphin-Mistral-24B-Venice-Edition
https://huggingface.co/TheDrummer/Big-Tiger-Gemma-27B-v3

Two examples above of uncensoring via training the model. I believe Drummer did SFT for Big Tiger Gemma and Dolphin would've been SFT and probably some RLHF.

My models are also generally more uncensored than the original models I trained on, but it's more a side effect of me not training it on any refusals and giving it uncensored creative data. It'll happily do any NSFW type RP that the original model might've balked at, but because I didn't feed it (much) uncensored assistant data, when prompted outside of those creative scenarios it still knows how to refuse in the same way the original model did.

Abliteration is steadily getting better though, so it's quite possible a method will be discovered that allows more fine-grained control over which behaviors you modify.

Oh btw @zerofata

I'd suggest using this jinja template in future Qwen releases. It fixes a couple of issues and allows system messages in the middle of the prompt, instead of crashing the frontend.
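For context, the mid-prompt crash usually comes from templates that raise an exception whenever a system message appears after the first turn. A minimal ChatML-style sketch of the tolerant behavior (purely illustrative, not the actual template attached here) just renders every message in order, regardless of role position:

```jinja
{#- Illustrative sketch only: accept system messages anywhere in the -#}
{#- conversation instead of raising an exception past the first turn. -#}
{%- for message in messages %}
<|im_start|>{{ message['role'] }}
{{ message['content'] }}<|im_end|>
{%- endfor %}
{%- if add_generation_prompt %}
<|im_start|>assistant
{%- endif %}
```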
