Commit History
Tighten eval cards UI and clean up stale local data 32864b0
Integrate with test backend data 7635aee
Add new component files and align app to EvalEval design system dbdd6d1
Replace shadcn-styled UI elements with design system primitives 187ffe6
Add plain-language captions and mode-aware framing for policy readers 3ad47c6
Align user-facing labels with paper terminology 4be62f9
Merge corpus dashboard into home as paper-aligned landing 5279156
Deploy DuckDB-backed frontend to da8db3e
Jenny Chim committed on
Separate policy and researcher views 9b4cdbb
Add interpretive signals, corpus dashboard, and slice browser bca888a
improve ux 8058fce
Differentiate audience modes and tighten eval navigation d8c2856
Aggregate setup aliases and clarify benchmark variants dd0b4fc
Improve eval/model UX, lite data paths, and leaderboard clarity 436ada0
Improve homepage loading and eval grouping 26a0d2d
Add per-benchmark comparison histograms on model detail 415ac43
Improve eval score displays and summary fallbacks bd8cbe8
Refine evaluation browsing UX a0dd44e
Refresh eval cards UI and backend data flow c1f2130
fix bugs 29afc21
fix bugs ae1dc39
fix bugs 04b4cff
ux changes 5f59721
Add survey e7123f0
fix: align reporting cues and developer slugs 5ca5561
feat: refine model and benchmark exploration 03e2430
redesigned 3a12290
rename benchmark to eval 0eafde7
Avijit Ghosh committed on
added better design and nav flow a328136
Avijit Ghosh committed on
fix data 8e60123
Avijit Ghosh committed on
fix data ddfc163
Avijit Ghosh committed on
add disclaimer cb623e2
Avijit Ghosh committed on
Refactor: Update benchmarks with realistic data, fix UI stats, and improve About page 2554366
Avijit Ghosh committed on
new ux 6978d97
Avijit Ghosh committed on