Commit History

Use slash-form path for in-page model navigation (drop %2F from URL)
73cce97

j-chim committed on

Adapt upstream category-based code to v2 tag taxonomy
bfb71af

j-chim committed on

Merge remote-tracking branch 'origin/main' into merge/main-into-v2-cleanup
6e90b4d

j-chim committed on

WIP: v2 cleanup checkpoint before merging origin/main
d249d5b

j-chim committed on

Add per-model + per-eval OpenGraph thumbnails; replace "EE/EC" mark with logo image
c315a26

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

view-data: coerce BIGINT-encoded numeric strings; restore distribution embed route
d49f850

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Untrack accidentally-committed noise files; gitignore the patterns
445ce35

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

view-data: paranoid CAST every output column to a primitive type
b6dab21

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

view-data: also wrap VARCHAR[] and TIMESTAMP columns; add app/error.tsx boundary
93c113e

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Homepage: drop duplicate "Evaluation Cards · Beta" hero kicker; accent corpus date
abb939f

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Embeds: histogram route, leaderboard slices + sort, brand mark; cross-source row dedup
7a54021

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Internal-feedback pass: rename to "Evaluation Cards", rework Summary view, simplify §4 metrics
faa9b3f

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Add Official/Community/All scope filter for developers; drop bar
4ba8d73

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Simplify interpretive signals heading
8494e4c

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Cross-suite signals, sortable leaderboard, theme cleanup
0314721

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Cross-source dedup, plotbox polish, pretty URLs, eval page fallbacks
0b45710

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Match nested benchmarks in /evals search; auto-expand families with hits
26f932a

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Move reader-mode toggle to detail pages; theme banners + apples-to-apples
4629534

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Merge cross-source benchmark families; tidy leaderboard panel + table chrome
8ef4cbc

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Make /models tables column-sortable; rebalance /evals + /models toolbars
5a2d59c

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Move split selector below the reporting comparison heading
629a612

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Fix sort toggle direction and remove categories as sortable column
c9c5a30

evijit (HF Staff), Claude Sonnet 4.6 committed on

Sort evals list by family name; add sortable columns; use cleaned display names
919a75f

evijit (HF Staff), Claude Sonnet 4.6 committed on

Dedup logic to counts
aac276a

j-chim committed on

stats change
f816900

j-chim committed on

Switch family/model views to curated category tags
bc08b3b

evijit (HF Staff) committed on

Route peer-ranks fetch through SNAPSHOT_URL sidecar
6cc7b0b

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Hotfix: categories
a80dd9f

j-chim committed on

Group model/eval-detail benchmarks by hierarchy.json families
f073e7a

evijit (HF Staff) committed on

Drop latest_timestamp fallback for release_date display
8717cca

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Wrap /evals page in Suspense for useSearchParams
3df9dfd

j-chim, Claude Opus 4.7 (1M context) committed on

Refactor to align on benchmark hierarchy
2ed4959

j-chim committed on

Remove unnecessary distinct() when reporting total results
2f8b51d

j-chim committed on

Update with datafix v2
11542d9

j-chim committed on

Consolidate hierarchy terminology + handle v2 hierarchy shape
350e866

evijit (HF Staff) committed on

Swap backend data (#3)
fe99ffa

evijit (HF Staff), j-chim committed on

Tighten eval cards UI and clean up stale local data
32864b0

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Add new component files and align app to EvalEval design system
dbdd6d1

evijit (HF Staff), Claude Sonnet 4.6 committed on

Align user-facing labels with paper terminology
4be62f9

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Merge corpus dashboard into home as paper-aligned landing
5279156

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

Fix Fibble Arena (and similar) suite link routing
c569d0f

j-chim, Claude Opus 4.7 (1M context) committed on

Add DuckDB shadow-read backend with source-metadata fix
2fcae3f

Jenny Chim, Claude Opus 4.7 (1M context) committed on

Separate policy and researcher views
9b4cdbb

evijit (HF Staff) committed on

Add interpretive signals, corpus dashboard, and slice browser
bca888a

evijit (HF Staff), Claude Opus 4.7 (1M context) committed on

improve ux
8058fce

evijit (HF Staff) committed on

Differentiate audience modes and tighten eval navigation
d8c2856

evijit (HF Staff) committed on

Aggregate setup aliases and clarify benchmark variants
dd0b4fc

evijit (HF Staff) committed on

Improve eval/model UX, lite data paths, and leaderboard clarity
436ada0

evijit (HF Staff) committed on

Improve homepage loading and eval grouping
26a0d2d

evijit (HF Staff) committed on

Add per-benchmark comparison histograms on model detail
415ac43

evijit (HF Staff), Claude Opus 4.6 (1M context) committed on