Spaces: DontPlanToEnd/UGI-Leaderboard

Eval request: new architectures (Qwen3-Coder-Next, Step-3.5-Flash, LongCat-Flash-Lite...)

#545

by Pentium95 - opened Feb 6

Base non-reasoning:

Base Thinking / Reasoning:

Finetunes non-reasoning:

https://huggingface.co/ConicCat/Role-mo-V2-32B (ChatML; Olmo-3.1-32B-Instruct finetune)
https://huggingface.co/BirdToast/olmo-v2-stage3-lexifreak-heretic-v2 (ChatML; Olmo-3.1-32B-Instruct finetune)
https://huggingface.co/MuXodious/Olmo-3.1-32B-Instruct-impotent-heresy
https://huggingface.co/Shifusen/Qwen3-Next-80B-A3B-Instruct-Decensored
https://huggingface.co/rpDungeon/Qwen3-VL-32B-Heretic-v2

Finetunes Thinking / Reasoning:

https://huggingface.co/Kilinskiy/Step-3.5-Flash-Ablitirated (also non-reasoning) (very promising)
https://huggingface.co/hell0ks/Solar-Open-100B-jailbreak
https://huggingface.co/Ex0bit/Step-3.5-Flash-PRISM-PRO (private, idk if it's possible to eval; there is a GGUF version here: https://huggingface.co/Ex0bit/Step-3.5-Flash-PRISM)
https://huggingface.co/cerebras/Step-3.5-Flash-REAP-121B-A11B (also non-reasoning)
https://huggingface.co/cerebras/Step-3.5-Flash-REAP-149B-A11B (also non-reasoning)

Feb 25

I'd love to see the Step-3.5-Flash-PRISM-PRO one for sure.
