Commit History
Show ACP agent results in the leaderboard (#11) 6d3b657
Hide incomplete entries from home page charts by default 0b126a8
Committed by openhands
fix: Add Visualization column to main table (not just benchmark tables) 5f628a6
Committed by openhands
feat: Add Visualization column for Laminar eval links dfa8bfc
Committed by openhands
Widen Logs column to prevent vertical stacking of download icons 9cab912
Committed by openhands
Add download icons to Logs column for benchmark results 67867ec
Committed by openhands
Format date column to show only date, not time 2b7cd27
Committed by openhands
Fix: Preserve mark_by selection during periodic data refresh df058f7
Committed by openhands
Add 'Mark systems by' selector for scatter plot icons (Company/Openness/Country) ed6e90d
Committed by openhands
Clean up unused code, files, and assets bb0cd90
Committed by openhands
Only show 'Detailed Benchmark Results' when more than one benchmark exists d17eff0
Committed by openhands
Connect 'Show only open models' checkbox to Winners and Evolution sections 49f9739
Committed by openhands
Winners by Category: put scores before names 24ff7a3
Committed by openhands
Use emojis instead of images in Winners by Category table 63c73f3
Committed by openhands
Refactor Winners by Category to single unified table 4f4eb00
Committed by openhands
Add Winners by Category section to main page c14a283
Committed by openhands
Add 'Show only open models' checkbox filter 5cdf97c
Committed by openhands
Add runtime column and Cost/Performance + Runtime/Performance charts to all pages 2854ddd
Committed by openhands
Move Download column to benchmark-specific tables only 4d0ae13
Committed by openhands
Add Download column for trajectory archives and increase table font size b5317d7
Committed by openhands
Add timer-based auto-refresh for leaderboard data 974f31f
Committed by openhands
Move 'Show incomplete entries' checkbox above plot and apply filter to both 361b5c2
Committed by openhands
Add periodic cache refresh for leaderboard data 6bddf26
Committed by openhands
Update DeepSeek logo, tooltip format, and category names 5778893
Committed by openhands
Fix table icons layout and add Qwen/MiniMax logos 72b86cb
Committed by openhands
UI cleanup and About page updates 6737ff3
Committed by openhands
Multiple graph and table improvements fcb3d0b
Committed by openhands
Replace open/closed model distinction with lock emojis in tables 8a3a9eb
Committed by openhands
Remove open/closed distinction from graph, use company logos as data points b6ec318
Committed by openhands
Add company logos to graphs and tables, label frontier points with model names 800e404
Committed by openhands
Replace total_cost with cost_per_instance (average cost per instance) b1f3e49
Committed by openhands
fix: Column naming and incomplete entries toggle 4ab5f97
Committed by openhands
feat: Update leaderboard calculations and add incomplete entries toggle 5998027
Committed by openhands
Fix UI score formatting: do not coerce NaN to 0; rely on format_score_column to show 'Not Submitted'. c68aa7d
Co-authored-by: openhands <openhands@all-hands.dev>
Committed by openhands
Fix data plotting requirements and server port handling; ensure per-benchmark plots use correct agent column. fb3d0db
- Respect HOST/PORT env for local runs
- Use 'OpenHands Version' in plot requirements
- Avoid plotting when use_plotly=False
Co-authored-by: openhands <openhands@all-hands.dev>
Committed by openhands
Remove unused AstaBench category files and update UI to OpenHands categories 6a0d1cb
Committed by openhands
Fix score calculation to match AstaBench methodology and update categories e734bf6
Committed by openhands
Swap column order and fix duplicate column warnings 3781804
Committed by openhands
Rename columns: Agent→OpenHands Version, Models Used→Language Model, remove Submitter 376500e
Committed by openhands
Force rebuild: Update comment 64c8899
Committed by openhands
Fix missing get_combined_icon_html reference in main page ae0bf64
Committed by openhands
Simplify leaderboard to open vs closed models only ca754bb
Committed by openhands
Fix runtime errors and data loading 66fcacd
Committed by openhands
Convert to JSONL data format and remove agent-eval dependency 1027cfb
Committed by openhands
Initial OpenHands Index leaderboard based on ASTA Bench 085a012
Committed by openhands