general-eval-card / components

Commit History

Add Official/Community/All scope filter for developers; drop bar
4ba8d73
evijit HF Staff Claude Opus 4.7 (1M context) committed on

Cross-source dedup, plotbox polish, pretty URLs, eval page fallbacks
6b39d1f

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Cross-suite signals, sortable leaderboard, theme cleanup
0314721

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Mount comparability panel above leaderboard, restyle, drop empty promises
02691ce

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Fix "null–null (null%)" confidence interval rendering
ae31eaf

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Add rule-based policy-mode summaries for model & eval views
aacebd7

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Cross-source dedup, plotbox polish, pretty URLs, eval page fallbacks
0b45710

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Match nested benchmarks in /evals search; auto-expand families with hits
26f932a

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Eval detail polish: hide empty fields, redesign splits, surface evaluator
c8aca27

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Move reader-mode toggle to detail pages; theme banners + apples-to-apples
4629534

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Merge cross-source benchmark families; tidy leaderboard panel + table chrome
8ef4cbc

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Restore curated benchmark families; polish frontier panel UX
ca20f78

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Live snapshot date, hide empty Updated col, clean slice contamination
cb0ce7c

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Humanize family names whose display matches the key under different separators
b763f91

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Make /models tables column-sortable; rebalance /evals + /models toolbars
5a2d59c

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Clean up Source column and per-row dataset label noise
eec1852

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Hide subtask-scope metrics from chips by default in matrix view
4cb8b56

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Render score-distribution metric picker as chips, not a dropdown
1303965

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Treat single-root-metric subtask evals as slice-pickable, not matrix
4ac3a9b

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Move split selector below the reporting comparison heading
629a612

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Fix sort toggle direction and remove categories as sortable column
c9c5a30

evijit HF Staff Claude Sonnet 4.6 committed on

Sort evals list by family name; add sortable columns; use cleaned display names
919a75f

evijit HF Staff Claude Sonnet 4.6 committed on

Fix ranks-high/low-in using only sidecar ordinal data
970fdbe

evijit HF Staff Claude Sonnet 4.6 committed on

Wire search bar to overlaps table and hide chips in overlaps view
0f5fb5f

evijit HF Staff Claude Sonnet 4.6 committed on

Compute and apply cleaned benchmark counts per model
c2e86ea

evijit HF Staff Claude Sonnet 4.6 committed on

Harden cleanHierarchy fallback and add family-name filter chips
8529a4b

evijit HF Staff Claude Sonnet 4.6 committed on

Restructure model details + extend cleanHierarchy for split families and aggregator dedup
06313c1

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Add list-view toggle to consolidate cross-family duplicate benchmarks
26eb09f

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Square off deep-dive theme and surface cross-family duplicates
b75f4c3

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Switch family/model views to curated category tags
bc08b3b

evijit HF Staff committed on

Route peer-ranks fetch through SNAPSHOT_URL sidecar
6cc7b0b

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Group model/eval-detail benchmarks by hierarchy.json families
f073e7a

evijit HF Staff committed on

Drop latest_timestamp fallback for release_date display
8717cca

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Guard summaryText against null in PolicyOverview
c3a3598

j-chim Claude Opus 4.7 (1M context) committed on

Refactor to align on benchmark hierarchy
2ed4959

j-chim committed on

Update with datafix v2
11542d9

j-chim committed on

Consolidate hierarchy terminology + handle v2 hierarchy shape
350e866

evijit HF Staff committed on

Reconcile UI with v2 backend payload + drop redundant signal cards
d52d9e0

evijit HF Staff committed on

Swap backend data (#3)
fe99ffa

evijit HF Staff j-chim committed on

Tighten eval cards UI and clean up stale local data
32864b0

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Add new component files and align app to EvalEval design system
dbdd6d1

evijit HF Staff Claude Sonnet 4.6 committed on

Replace shadcn-styled UI elements with design system primitives
187ffe6

evijit HF Staff Claude Sonnet 4.6 committed on

Add plain-language captions and mode-aware framing for policy readers
3ad47c6

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Align user-facing labels with paper terminology
4be62f9

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Merge corpus dashboard into home as paper-aligned landing
5279156

evijit HF Staff Claude Opus 4.7 (1M context) committed on

Deploy DuckDB-backed frontend to
da8db3e

Jenny Chim committed on

Separate policy and researcher views
9b4cdbb

evijit HF Staff committed on

Add interpretive signals, corpus dashboard, and slice browser
bca888a

evijit HF Staff Claude Opus 4.7 (1M context) committed on

improve ux
8058fce

evijit HF Staff committed on

Differentiate audience modes and tighten eval navigation
d8c2856

evijit HF Staff committed on