The scores for K2-thinking and MiMo-V2-Flash were mixed up

#7
by StarryZhang - opened

Hi,

In the benchmark scores you reported, K2-thinking’s SWE-bench Verified score is 73.4. However, this score currently only appears in the results reported for MiMo-V2-Flash. I believe this is a mistake :)

Thank you for pointing this out. We’ll fix it shortly.

K2-thinking: Thank you for the high score given by our friend :p
