What's the difference between the prefill and non-prefill evals?
#643
by korinsl - opened
For Gemma 4 Heretic the prefill option gets a significantly higher score. What's the difference?
Prefilling (inserting a thinking token) is essentially the same thing as reasoning=true. The only difference is whether you apply the model's prompt template yourself or let a library do it. So the prefill entry is the model evaluated with reasoning on. I could probably rename it to something like "Gemma 4 Heretic (reasoning)" to be clearer.
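To make the equivalence concrete, here's a rough sketch in plain Python. Everything in it is made up for illustration: the marker strings, `render_prompt`, and `render_prompt_with_prefill` are hypothetical stand-ins, not Gemma's actual template or any library's API. It only shows that a reasoning flag handled by the template and a manually appended thinking token can end up as the same prompt string.

```python
# Hypothetical markers for illustration only. Real models define their own
# chat template and their own thinking token.
USER_OPEN = "<user>"
TURN_END = "<end>"
MODEL_OPEN = "<model>"
THINK_TOKEN = "<think>"


def render_prompt(user_message: str, reasoning: bool = False) -> str:
    """Stand-in for a library applying the chat template for you."""
    prompt = f"{USER_OPEN}{user_message}{TURN_END}{MODEL_OPEN}"
    if reasoning:
        # The "library" opens the thinking block on the model's behalf.
        prompt += THINK_TOKEN
    return prompt


def render_prompt_with_prefill(user_message: str, prefill: str) -> str:
    """Build the prompt without reasoning, then append the prefill by hand."""
    return render_prompt(user_message, reasoning=False) + prefill


question = "What is 17 * 23?"
via_flag = render_prompt(question, reasoning=True)
via_prefill = render_prompt_with_prefill(question, THINK_TOKEN)

# Both routes give the model the same text to continue from.
print(via_flag == via_prefill)
```

In both cases the model starts generating right after the thinking token, so it reasons before answering. Without the flag or the prefill, the prompt stops at the model turn marker and the model may answer directly.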