Spaces: yananlong / general-eval-card

general-eval-card / lib / eval-processing.ts

Commit History

Refactor to align on benchmark hierarchy

2ed4959

j-chim committed 17 days ago

Update with datafix v2

11542d9

j-chim committed 18 days ago

Tighten eval cards UI and clean up stale local data

32864b0

evijit (HF Staff) and Claude Opus 4.7 (1M context) committed 19 days ago

Separate policy and researcher views

9b4cdbb

evijit (HF Staff) committed 23 days ago

Add interpretive signals, corpus dashboard, and slice browser

bca888a

evijit (HF Staff) and Claude Opus 4.7 (1M context) committed 25 days ago

Aggregate setup aliases and clarify benchmark variants

dd0b4fc

evijit (HF Staff) committed on Apr 14

Add per-benchmark comparison histograms on model detail

415ac43

evijit (HF Staff) and Claude Opus 4.6 (1M context) committed on Apr 13

Refresh eval cards UI and backend data flow

c1f2130

evijit (HF Staff) committed on Apr 10

fix bugs

04b4cff

evijit (HF Staff) committed on Apr 7

ux changes

5f59721

evijit (HF Staff) committed on Apr 6

feat: refine model and benchmark exploration

03e2430

evijit (HF Staff) committed on Mar 28

redesigned

3a12290

evijit (HF Staff) committed on Mar 27

fix data

ddfc163

Avijit Ghosh committed on Dec 16, 2025

Refactor: Update benchmarks with realistic data, fix UI stats, and improve About page

2554366

Avijit Ghosh committed on Dec 16, 2025

new ux

6978d97

Avijit Ghosh committed on Dec 16, 2025
