Category Error
#3
by CO-IR - opened
Regarding the benchmark metrics, HLE(humanity last exam) should belong to higher-order reasoning, not the code domain.
Fixed.
Regarding the benchmark metrics, HLE(humanity last exam) should belong to higher-order reasoning, not the code domain.
Fixed.