Commit History

eval: submit GPQA Diamond result via PR (community badge) (#1)
d050408

terry-u committed on

docs: reflect value 0.01
b551049
verified

terry-u committed on

docs: reflect value 0.01
88e0e32
verified

terry-u committed on

chore: bump GPQA eval value to 0.01 (placeholder, not measured)
e42a3cf
verified

terry-u committed on

docs: update LEADERBOARD with eval_results decision
2b80e41
verified

terry-u committed on

docs: reflect .eval_results addition in README
55da13b
verified

terry-u committed on

feat: add .eval_results/gpqa.yaml (GPQA Diamond, value 0 / not evaluated)
562b037
verified

terry-u committed on

docs: show GPQA Diamond on the leaderboard (our model: 0 / not evaluated; other models: measured values from cited sources)
5c5bb35
verified

terry-u committed on

docs: show GPQA Diamond on the leaderboard (our model: 0 / not evaluated; other models: measured values from cited sources)
c3f5e55
verified

terry-u committed on

Upload folder using huggingface_hub
c834750
verified

terry-u committed on

initial commit
7f645b8
verified

terry-u committed on