fwiw it definitely is an improvement

#1
by SerialKicked - opened

In v1, both thinking and tool-calling were broken even in chat completion mode, while in v2 they work just fine. I haven't played with the more creative side much yet, but it seemed okay at it.

I'm kinda curious, any personal opinion on the "fine-tunability" of Q 3.5 so far?

Finetunes surprisingly well for a Qwen model IMO; the only downside is the lack of NSFW data in Qwen's pretrain.

Still working through finding the right hyperparams for this model though; it can be difficult to tell whether you're overfitting or underfitting with it, so I'm still fiddling with params and slight variations of the datamix.

I think this was a bit overfit according to some other testers, though I personally found it OK. I'm probably a bit biased, though. Definitely the most interesting model to finetune at the moment for me personally.

Yeah, gotcha. I understand it's still a bit early to be entirely sure, but that's good to hear, not sure I would have enjoyed another year of MS24B-only. πŸ˜„

v1 was definitely an overfit, not sure for v2 yet, it seems fine, but I didn't test much.

Anyway, thanks for the models, and good luck!

@zerofata why do you use vanilla Qwen as the base instead of decensored ones like ArliAI/Qwen3.5-27B-Derestricted or Bobi099/Qwen3.5-27B-heretic?

It sounds like a better starting point to finetune an RP model on; ReadyArt/Omega-Evolution-27B-v1.0 did this too.

I've got mixed feelings on abliteration, even after all the improvements made to it in the last few months.

I haven't yet seen an abliteration that can tell the difference between in-character refusals and assistant refusals. Personal opinion, but it's much easier to uncensor a model than it is to fix the damage caused by abliteration that has removed the concept of "no" from a model, even the "better" ones that try to leave harmless prompts unchanged.

Same, I've probably tested hundreds of models at this point, they are always a downgrade. Abliteration has some use cases, but this ain't one of them.

Thank you very much for your explanation, I never considered that.

> Personal opinion, but it's much easier to uncensor a model than it is to fix the damage caused by abliteration that has removed the concept of "no" from a model

I'm curious - what techniques for uncensoring are you referring to outside of abliteration?

Finetuning / training the model primarily.

https://huggingface.co/dphn/Dolphin-Mistral-24B-Venice-Edition
https://huggingface.co/TheDrummer/Big-Tiger-Gemma-27B-v3

Two examples above of uncensoring via training the model. I believe Drummer did SFT for Big Tiger Gemma and Dolphin would've been SFT and probably some RLHF.

My models are also generally more uncensored than the original models I trained on, but it's more a side effect of me not training it on any refusals and giving it uncensored creative data. It'll happily do any NSFW type RP that the original model might've balked at, but because I didn't feed it (much) uncensored assistant data, when prompted outside of those creative scenarios it still knows how to refuse in the same way the original model did.

Abliteration is steadily getting better though, so it's quite possible a method will be discovered that allows more fine-grained control over which behaviors you modify.

Oh btw @zerofata

I'd suggest using this jinja template in future Qwen releases. It fixes a couple of issues and allows system messages in the middle of the prompt, instead of crashing the frontend.
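For context, the mid-prompt crash usually comes from templates that raise an exception whenever a system message appears after the first turn. A minimal ChatML-style sketch of the tolerant behavior (purely illustrative, not the actual template attached here) just renders every message in order, regardless of role position:

```jinja
{#- Illustrative sketch only: accept system messages anywhere in the -#}
{#- conversation instead of raising an exception past the first turn. -#}
{%- for message in messages %}
<|im_start|>{{ message['role'] }}
{{ message['content'] }}<|im_end|>
{%- endfor %}
{%- if add_generation_prompt %}
<|im_start|>assistant
{%- endif %}
```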
