Because every new model is evaluated against all the other LLMs, you'll run into token usage that grows roughly quadratically with the number of models. 💸💸💸
I'd suggest evaluating new models only against the top 5 LLMs by win rate.
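
A rough sketch of the difference, assuming every matchup costs about the same number of tokens (that's my simplification, and the function names are just illustrative, not anything from the leaderboard's code):

```python
def all_pairs_matchups(n_models: int) -> int:
    """Total matchups when every model is compared with every other model."""
    return n_models * (n_models - 1) // 2


def top_k_matchups(n_models: int, k: int = 5) -> int:
    """Total matchups when each newly added model only faces the current top k."""
    # The i-th model added (0-based) has at most i existing models to face.
    return sum(min(i, k) for i in range(n_models))


for n in (10, 50, 100):
    print(f"{n} models: all-pairs={all_pairs_matchups(n)}, top-5={top_k_matchups(n)}")
```

With all-pairs the total grows with the square of the number of models, while capping at the top 5 keeps it roughly linear (about 5 matchups per new model).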
@RaphDab