Independent evaluation results

#78

by yaronr - opened Sep 26, 2024


yaronr

Sep 26, 2024

Dear THUDM team,

I'm pleased to share our independent evaluation of the model using our implementation of the MMLU-Pro benchmark.

I hope you find this useful.

xujfcn

Feb 24

Great discussion! For anyone who wants to test this quickly, Crazyrouter offers API access to this model. No infrastructure setup is needed, only an API key and the standard OpenAI SDK.
