lemer-bk / results

Commit History

docs: add MMLU-Pro best-per-category table (46.4%), 5-shot results
cc4b6e4

Snider Virgil committed on

data: add Lemer(1,1) full 14-category results with transcripts
4afef24

Snider committed on

data: add benchmark result stock-e2b-bf16-math-think-temp1.json
835b8a2
verified

lthn committed on

data: add benchmark result stock-e2b-bf16-math-temp0.json
86673fd
verified

lthn committed on

data: add benchmark result stock-e2b-bf16-math-nothink-temp1.json
4e3885c
verified

lthn committed on

data: add benchmark result stock-e2b-bf16-math-nothink-temp0.json
1a9b96a
verified

lthn committed on

data: add benchmark result stock-e2b-bf16-biology-temp0.json
fa15fbe
verified

lthn committed on

data: add benchmark result lemer-bf16-math-think-temp1.json
8af8909
verified

lthn committed on

data: add benchmark result lemer-bf16-math-temp0.json
ccb779a
verified

lthn committed on

data: add benchmark result lemer-bf16-math-nothink-temp1.json
29f9886
verified

lthn committed on

data: add benchmark result lemer-bf16-math-nothink-temp0.json
a740c0b
verified

lthn committed on

data: add benchmark result lemer-bf16-biology-temp0.json
54d7fb0
verified

lthn committed on