Spaces: DontPlanToEnd/UGI-Leaderboard


Eval Request

#553

by MuXodious - opened Feb 12

Discussion

MuXodious

Feb 12

I'm still trying to get gpt-oss 120b right and would like to have this one tested to see where progress stands. It is a somewhat experimental release, following the previous, badly broken one.

https://huggingface.co/MuXodious/gpt-oss-120b-tainted-heresy

MuXodious

Feb 12 • edited Feb 21 by DontPlanToEnd

I decided to experiment a little with noslopping in Heretic. I don't expect it to greatly improve the model's creative writing ability; however, there should be some improvement in the sub-metrics.
It's based on Marcjoni/QuasiStarSynth-12B, which was previously evaluated on the UGI Leaderboard.

~~https://huggingface.co/MuXodious/QuasiStarSynth-12B-noslop~~

~~https://huggingface.co/MuXodious/QuasiStarSynth-12B-noslop-absolute-heresy~~

MuXodious

Feb 14 • edited Feb 17 by DontPlanToEnd

Well, after some discussion with people at the Heretic HQ, I re-hereticated GLM 4.7 Flash, the previous version of which topped its peers in UGI scores. This one was made with a more informed configuration and a more mature codebase. If possible, I would like to have it thrown into the UGI evaluation machine as well.

~~https://huggingface.co/MuXodious/GLM-4.7-Flash-absolute-heresy~~

MuXodious

Feb 17

This was very informative about the effects of my current method. At the cost of, well, pretty much everything else, the model is made to bend to the user's will. While the latter is desired and was basically the main axis of my approach, we cannot afford the former: preserving model capabilities is essential, and losing them points toward lobotomisation. Thank you for the evaluation.

DontPlanToEnd changed discussion status to closed Feb 24
