Commit History

Internal-feedback pass: rename to "Evaluation Cards", rework Summary view, simplify §4 metrics
faa9b3f

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Add Official/Community/All scope filter for developers; drop bar
4ba8d73

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Simplify interpretive signals heading
8494e4c

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Cross-suite signals, sortable leaderboard, theme cleanup
0314721

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Cross-source dedup, plotbox polish, pretty URLs, eval page fallbacks
0b45710

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Match nested benchmarks in /evals search; auto-expand families with hits
26f932a

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Move reader-mode toggle to detail pages; theme banners + apples-to-apples
4629534

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Merge cross-source benchmark families; tidy leaderboard panel + table chrome
8ef4cbc

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Make /models tables column-sortable; rebalance /evals + /models toolbars
5a2d59c

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Move split selector below the reporting comparison heading
629a612

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Fix sort toggle direction and remove categories as sortable column
c9c5a30

evijit HF Staff Claude Sonnet 4.6 committed on

Sort evals list by family name; add sortable columns; use cleaned display names
919a75f

evijit HF Staff Claude Sonnet 4.6 committed on

Dedup logic to counts
aac276a

j-chim committed on

stats change
f816900

j-chim committed on

Switch family/model views to curated category tags
bc08b3b

evijit HF Staff committed on

Route peer-ranks fetch through SNAPSHOT_URL sidecar
6cc7b0b

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Hotfix: categories
a80dd9f

j-chim committed on

Group model/eval-detail benchmarks by hierarchy.json families
f073e7a

evijit HF Staff committed on

Drop latest_timestamp fallback for release_date display
8717cca

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Wrap /evals page in Suspense for useSearchParams
3df9dfd

j-chim Claude Opus 4.7 (1M context) committed on

Refactor to align on benchmark hierarchy
2ed4959

j-chim committed on

Remove unnecessary distinct() when reporting total results
2f8b51d

j-chim committed on

Update with datafix v2
11542d9

j-chim committed on

Consolidate hierarchy terminology + handle v2 hierarchy shape
350e866

evijit HF Staff committed on

Swap backend data (#3)
fe99ffa

evijit HF Staff j-chim committed on

Tighten eval cards UI and clean up stale local data
32864b0

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Add new component files and align app to EvalEval design system
dbdd6d1

evijit HF Staff Claude Sonnet 4.6 committed on

Align user-facing labels with paper terminology
4be62f9

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Merge corpus dashboard into home as paper-aligned landing
5279156

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Fix Fibble Arena (and similar) suite link routing
c569d0f

j-chim Claude Opus 4.7 (1M context) committed on

Add DuckDB shadow-read backend with source-metadata fix
2fcae3f

Jenny Chim Claude Opus 4.7 (1M context) committed on

Separate policy and researcher views
9b4cdbb

evijit HF Staff committed on

Add interpretive signals, corpus dashboard, and slice browser
bca888a

evijit HF Staff Claude Opus 4.7 (1M context) committed on

improve ux
8058fce

evijit HF Staff committed on

Differentiate audience modes and tighten eval navigation
d8c2856

evijit HF Staff committed on

Aggregate setup aliases and clarify benchmark variants
dd0b4fc

evijit HF Staff committed on

Improve eval/model UX, lite data paths, and leaderboard clarity
436ada0

evijit HF Staff committed on

Improve homepage loading and eval grouping
26a0d2d

evijit HF Staff committed on

Add per-benchmark comparison histograms on model detail
415ac43

evijit HF Staff Claude Opus 4.6 (1M context) committed on

Add site favicon metadata
35729f5

evijit HF Staff committed on

Improve eval score displays and summary fallbacks
bd8cbe8

evijit HF Staff committed on

Harden aggregate evals and cache refresh
9d14977

evijit HF Staff committed on

Refine evaluation browsing UX
a0dd44e

evijit HF Staff committed on

Refresh eval cards UI and backend data flow
c1f2130

evijit HF Staff committed on

Fix survey submit: use correct HF commit API JSON format with files array
c5372a8

evijit HF Staff Claude Opus 4.6 (1M context) committed on

Fix survey submit: use multipart form data for HF commit API
9481599

evijit HF Staff Claude Opus 4.6 (1M context) committed on

Add alert feedback on survey submit success/failure
872607f

evijit HF Staff Claude Opus 4.6 (1M context) committed on

Fix HF commit API field: summary not commit_message
023694a

evijit HF Staff Claude Opus 4.6 (1M context) committed on

Fix survey submission: use HF commit API instead of deprecated upload
ddf16f4

evijit HF Staff Claude Opus 4.6 (1M context) committed on

Add survey submission and update survey text for public use
516ec04

evijit HF Staff Claude Opus 4.6 (1M context) committed on