Add DuckDB shadow-read backend with source-metadata fix 2fcae3f Jenny Chim Claude Opus 4.7 (1M context) committed 29 days ago
Add interpretive signals, corpus dashboard, and slice browser bca888a evijit HF Staff Claude Opus 4.7 (1M context) committed 29 days ago
Preserve evaluator_relationship when flattening model hierarchy 431b0cc evijit HF Staff committed on Apr 15
Fix RewardBench2 key normalization for matrix leaderboard routing 8821e18 evijit HF Staff committed on Apr 14
Improve eval/model UX, lite data paths, and leaderboard clarity 436ada0 evijit HF Staff committed on Apr 14
Add per-benchmark comparison histograms on model detail 415ac43 evijit HF Staff Claude Opus 4.6 (1M context) committed on Apr 13
Add survey submission and update survey text for public use 516ec04 evijit HF Staff Claude Opus 4.6 (1M context) committed on Apr 8
Refactor: Update benchmarks with realistic data, fix UI stats, and improve About page 2554366 Avijit Ghosh committed on Dec 16, 2025