ml-intern / eval

Commit History

feat: CLI local mode, slash commands, interrupt support; remove lmnr; frontend fixes
82b0c13

akseljoonas Claude Opus 4.6 committed on

fix: properly close SDK message on error, show tool errorText
b22c5f3

akseljoonas committed on

Revert "fix: show errorText for failed tools, bump eval max_iterations to 300"
f765eb4

akseljoonas committed on

fix: show errorText for failed tools, bump eval max_iterations to 300
b2846f6

akseljoonas committed on

functioning frontend and docker
ba93c86

akseljoonas committed on

generated, filled in and verified 250 eval questions
7534b92

akseljoonas committed on

intermediate commit until i let amp loose
8bd1c22

akseljoonas committed on

eval readme update
a9d2c33

akseljoonas committed on

gpt 5 nano judge
0aa56ff

akseljoonas committed on

fixing tracing
b402135

akseljoonas committed on

adding claude code + mcp
eab219c

akseljoonas committed on

leaderboard and results
df3b181

akseljoonas committed on

updated eval
235ace7

akseljoonas committed on

dataset creation script
73d437d

akseljoonas committed on

adding readme
dc71e7b

akseljoonas committed on

adding hf datasets i/o
c1fac32

akseljoonas committed on

eval script done
bc84cfe

akseljoonas committed on

modified eval prompt
f050c81

akseljoonas committed on

thinking if we want eval or not
522a08c

akseljoonas committed on