Commit History
Add rule-based policy-mode summaries for model & eval views aacebd7
Cross-source dedup, plotbox polish, pretty URLs, eval page fallbacks 0b45710
Auto-purge sidecar bucket when Next.js BUILD_ID changes e9dae58
Bump clean-hierarchy cache version to v13 to drop stale blob 4d3de5c
Merge cross-source benchmark families; tidy leaderboard panel + table chrome 8ef4cbc
Drop alias-only single-bench families without merging them cb0db40
Restore curated benchmark families; polish frontier panel UX ca20f78
Precompute eval matrices for multi-metric + per-slice leaderboards 553b175
Restore HF Open LLM v2 composite and dedup vals.ai aliases 6db4f51
Add local parquet read support aa29970
Sort evals list by family name; add sortable columns; use cleaned display names 919a75f
Dedup logic to counts aac276a
Compute and apply cleaned benchmark counts per model c2e86ea
Remove raw-hierarchy fallback — only ever serve cleaned hierarchy b5fa10d
Harden cleanHierarchy fallback and add family-name filter chips 8529a4b
Bump clean-hierarchy cache version to v10 to bust stale HF Space cache 4bf0591
Restructure model details + extend cleanHierarchy for split families and aggregator dedup 06313c1
Add option to purge cache f2e3a0a
stats change f816900
Prefer /data persistent bucket for sidecar cache when available dc95237
Disk-cache snapshot sidecars to skip cold-start re-downloads 40339dc
Switch family/model views to curated category tags bc08b3b
Route peer-ranks fetch through SNAPSHOT_URL sidecar 6cc7b0b
Hotfix: categories a80dd9f
Group model/eval-detail benchmarks by hierarchy.json families f073e7a
Refactor to align on benchmark hierarchy 2ed4959
Update with datafix v2 11542d9
Tighten eval cards UI and clean up stale local data 32864b0
Merge corpus dashboard into home as paper-aligned landing 5279156
Deploy DuckDB-backed frontend to da8db3e
Jenny Chim committed on
Add DuckDB shadow-read backend with source-metadata fix 2fcae3f
Jenny Chim, Claude Opus 4.7 (1M context) committed on