Add Official/Community/All scope filter for developers; drop bar 4ba8d73 evijit HF Staff Claude Opus 4.7 (1M context) committed 17 days ago
Ignore local whisker-render.mjs probe script b02b887 evijit HF Staff Claude Opus 4.7 (1M context) committed 17 days ago
Simplify interpretive signals heading 8494e4c evijit HF Staff Claude Opus 4.7 (1M context) committed 17 days ago
Cross-source dedup, plotbox polish, pretty URLs, eval page fallbacks 6b39d1f evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Cross-suite signals, sortable leaderboard, theme cleanup 0314721 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Mount comparability panel above leaderboard, restyle, drop empty promises 02691ce evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Fix "null–null (null%)" confidence interval rendering ae31eaf evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Add rule-based policy-mode summaries for model & eval views aacebd7 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Cross-source dedup, plotbox polish, pretty URLs, eval page fallbacks 0b45710 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Match nested benchmarks in /evals search; auto-expand families with hits 26f932a evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Eval detail polish: hide empty fields, redesign splits, surface evaluator c8aca27 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Move reader-mode toggle to detail pages; theme banners + apples-to-apples 4629534 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Auto-purge sidecar bucket when Next.js BUILD_ID changes e9dae58 evijit HF Staff committed 18 days ago
Bump clean-hierarchy cache version to v13 to drop stale blob 4d3de5c evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Merge cross-source benchmark families; tidy leaderboard panel + table chrome 8ef4cbc evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Drop alias-only single-bench families without merging them cb0db40 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Restore curated benchmark families; polish frontier panel UX ca20f78 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Live snapshot date, hide empty Updated col, clean slice contamination cb0ce7c evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Humanize family names whose display matches the key under different separators b763f91 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Make /models tables column-sortable; rebalance /evals + /models toolbars 5a2d59c evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Clean up Source column and per-row dataset label noise eec1852 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Hide subtask-scope metrics from chips by default in matrix view 4cb8b56 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Render score-distribution metric picker as chips, not a dropdown 1303965 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Treat single-root-metric subtask evals as slice-pickable, not matrix 4ac3a9b evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Precompute eval matrices for multi-metric + per-slice leaderboards 553b175 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Restore HF Open LLM v2 composite and dedup vals.ai aliases 6db4f51 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Move split selector below the reporting comparison heading 629a612 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Fix sort toggle direction and remove categories as sortable column c9c5a30 evijit HF Staff Claude Sonnet 4.6 committed 18 days ago
Sort evals list by family name; add sortable columns; use cleaned display names 919a75f evijit HF Staff Claude Sonnet 4.6 committed 18 days ago
Fix ranks-high/low-in using only sidecar ordinal data 970fdbe evijit HF Staff Claude Sonnet 4.6 committed 18 days ago
Wire search bar to overlaps table and hide chips in overlaps view 0f5fb5f evijit HF Staff Claude Sonnet 4.6 committed 18 days ago
Compute and apply cleaned benchmark counts per model c2e86ea evijit HF Staff Claude Sonnet 4.6 committed 18 days ago
Remove raw-hierarchy fallback — only ever serve cleaned hierarchy b5fa10d evijit HF Staff Claude Sonnet 4.6 committed 18 days ago
Harden cleanHierarchy fallback and add family-name filter chips 8529a4b evijit HF Staff Claude Sonnet 4.6 committed 18 days ago
Bump clean-hierarchy cache version to v10 to bust stale HF Space cache 4bf0591 evijit HF Staff Claude Sonnet 4.6 committed 18 days ago
Restructure model details + extend cleanHierarchy for split families and aggregator dedup 06313c1 evijit HF Staff Claude Opus 4.7 (1M context) committed 18 days ago
Add list-view toggle to consolidate cross-family duplicate benchmarks 26eb09f evijit HF Staff Claude Opus 4.7 (1M context) committed 19 days ago
Square off deep-dive theme and surface cross-family duplicates b75f4c3 evijit HF Staff Claude Opus 4.7 (1M context) committed 19 days ago
Prefer /data persistent bucket for sidecar cache when available dc95237 evijit HF Staff Claude Opus 4.7 (1M context) committed 19 days ago
Disk-cache snapshot sidecars to skip cold-start re-downloads 40339dc evijit HF Staff Claude Opus 4.7 (1M context) committed 19 days ago
Route peer-ranks fetch through SNAPSHOT_URL sidecar 6cc7b0b evijit HF Staff Claude Opus 4.7 (1M context) committed 19 days ago
Group model/eval-detail benchmarks by hierarchy.json families f073e7a evijit HF Staff committed 19 days ago
Drop latest_timestamp fallback for release_date display 8717cca evijit HF Staff Claude Opus 4.7 (1M context) committed 19 days ago
Guard summaryText against null in PolicyOverview c3a3598 j-chim Claude Opus 4.7 (1M context) committed 19 days ago