Bundle should-refuse sweep data for Calibration tab b3d466f verified VibeCodingScientist committed on about 20 hours ago
Add Calibration tab: PC-tier scatter + TPR bars from should-refuse sweep 240e3ec verified VibeCodingScientist committed on about 20 hours ago
Redesign UI: theme-aware leaderboard, thesis-forward hero, cleaner cells 5eaec60 verified VibeCodingScientist committed on about 20 hours ago
Redesign leaderboard: two-row header, heatmap tints, progress bars, rank column, intro blurb 49bc134 verified VibeCodingScientist committed on about 21 hours ago
Fix sdk_version to 5.50.0 (exact release required by HF) 09df299 verified VibeCodingScientist committed on about 21 hours ago
Deploy RefusalBench leaderboard (v1.1-frozen, arXiv:2605.21545) 3b68594 verified VibeCodingScientist committed on about 21 hours ago
Deploy RefusalBench leaderboard (v1.1-frozen, arXiv:2605.21545) ab29e65 verified VibeCodingScientist committed on about 21 hours ago