lingoly-too / leaderboard.csv
Model,Provider,Type,Baseline score,Obfuscated score
Aya 23 35B,Cohere,Open source,0.10654349746757057,0.057081801
Claude 3.5 Sonnet,Anthropic,Closed source,0.48255271180599657,0.2810140963355337
Claude 3.7 Sonnet,Anthropic,Closed source,0.604975,0.428881
GPT 4.5,OpenAI,Closed source,0.4208265195574057,0.2545024812218498
GPT 4o,OpenAI,Closed source,0.31371291749661456,0.1563339989919302
Gemini 1.5 Pro,Google,Closed source,0.3690345167304693,0.20461522579355207
Llama 3.3 70B-Instruct,Meta,Open source,0.11452795751175084,0.082131188
Phi4,Microsoft,Open source,0.1809802769595679,0.10996628714372364
DeepSeek R1,DeepSeek,Open source,0.3965527162895584,0.2649618642615188
o1-preview,OpenAI,Closed source,0.47730527712315257,0.3222020975619888
o3-mini (high),OpenAI,Closed source,0.42172257807447155,0.3059086523804619
o3-mini (low),OpenAI,Closed source,0.249751,0.122204
Gemini 2.5 Pro,Google,Closed source,0.589055,0.423539
GPT-5,OpenAI,Closed source,0.609,0.467
Claude Opus 4.1,Anthropic,Closed source,0.592,0.458
DeepSeek-V3.1-Terminus,DeepSeek,Open source,0.554,0.422