Spaces: Foaster/Werewolf_benchmark

New models = Exponential curve

by Pendrokar - opened Oct 17, 2025

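Because every new model is evaluated against all the other LLMs, you'll run into a steeply rising curve for the tokens used: each model you add has to play every model already on the board, so the total keeps accelerating (quadratically if matchups are pairwise, even if it feels exponential on the bill). 💸💸💸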

I'd suggest evaluating new ones only against the top 5 LLMs in win rate.
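
To put rough numbers on this, here is a minimal sketch comparing the two schemes. It assumes matchups are pairwise (one model against one other model), which may not match how the benchmark actually schedules games, and the function names are made up for illustration. It counts matchups only; tokens per game are not modelled.

```python
def round_robin_matchups(n_models: int) -> int:
    """Pairings when every model plays every other model once (assumed pairwise)."""
    return n_models * (n_models - 1) // 2


def top_k_matchups(n_existing: int, n_new: int, k: int = 5) -> int:
    """Pairings when the existing pool was evaluated round-robin and
    each new model only plays the current top k by win rate."""
    return round_robin_matchups(n_existing) + n_new * min(k, n_existing)


if __name__ == "__main__":
    n_existing = 10  # hypothetical size of the current leaderboard
    for n_new in (0, 5, 10, 20):
        full = round_robin_matchups(n_existing + n_new)
        capped = top_k_matchups(n_existing, n_new, k=5)
        print(f"{n_new:>2} new models: full={full:>4}  top-5 only={capped:>4}")
```

With the top-5 rule, each new model adds a fixed 5 matchups instead of one per model already on the board, so the cost grows linearly with the number of models added.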
