Eval request: GLM-4.7-Flash

#523
by Pentium95 - opened

Funny how one of the only A3B models isn't on the Leaderboard yet. +1 on this!!!

Probably because vLLM support is still not perfect, and llama.cpp still has lots of prompt-processing slowdowns and output repetition. I guess the project owner is waiting to make sure they address and fix the "day-1" bugs, to avoid having to benchmark them again 😁

I had been getting "Value error, The checkpoint you are trying to load has model type glm4_moe_lite but Transformers does not recognize this architecture.", but was able to fix that by installing transformers from source.
But now I'm getting "ValueError: There is no module or parameter named 'model.layers.46.mlp.gate.e_score_correction_bias' in TransformersMoEForCausalLM"
Similar to the one reported in https://huggingface.co/zai-org/GLM-4.7-Flash/discussions/34
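For anyone hitting the same glm4_moe_lite error, here's a minimal sketch of installing transformers from source, assuming pip and that the architecture support has landed on the main branch:

```bash
# Install transformers straight from the main branch on GitHub,
# which carries architectures not yet in a tagged release.
pip install git+https://github.com/huggingface/transformers.git
```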

https://discuss.vllm.ai/t/glm-4-7-flash-with-nvidia/2256/2

They suggest installing the nightly wheels too.
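For reference, a sketch of what that looks like, assuming the standard vLLM nightly wheel index:

```bash
# Upgrade to the latest vLLM pre-release build from the nightly index.
pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
```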

May I have this thrown in with the Extra? https://huggingface.co/MuXodious/GLM-4.7-Flash-REAP-23B-A3B-absolute-heresy 👀

I didn't notice that LordNeel/GLM-4.7-Flash-Unblinded-Mastery is just a LoRA, not merged back into the model. My bad.
