Commit History

Fix frontier labels to use log10 coordinates for log scale axis
8cdce51

openhands openhands committed on

Method C: Use domain coordinates for layout images with log scale
4a4a7b5

openhands openhands committed on

Method B: Use raw data values for layout images (revert log transformation)
46ff970

openhands openhands committed on

Method A: Fix logo positions for log scale x-axis
7b4a3a1

openhands openhands committed on

Enable log scale for x-axis (cost) in graphs
5d32e7b

openhands openhands committed on

Trigger rebuild
8d4cfda

openhands committed on

Update intro text to focus on motivation rather than metrics
369c590

openhands openhands committed on

Fix table icons layout and add Qwen/MiniMax logos
72b86cb

openhands openhands committed on

UI cleanup and About page updates
6737ff3

openhands openhands committed on

Fix graph alignment issues
af81bcf

openhands openhands committed on

Multiple graph and table improvements
fcb3d0b

openhands openhands committed on

Replace open/closed model distinction with lock emojis in tables
8a3a9eb

openhands openhands committed on

Remove open/closed distinction from graph, use company logos as data points
b6ec318

openhands openhands committed on

Add company logos to graphs and tables, label frontier points with model names
800e404

openhands openhands committed on

Replace total_cost with cost_per_instance (average cost per instance)
b1f3e49

openhands openhands committed on

Fix TypeError when summing costs with None values
cdd40ba

openhands openhands committed on

Remove Test Set/Validation Set tabs, keep single results view
b8aea20

openhands openhands committed on

fix: Update Total cost description in intro to be sum, not average
6a5c447

openhands openhands committed on

docs: Update descriptive text to use Average Score and Total Cost
bb0f7af

openhands openhands committed on

fix: Column naming and incomplete entries toggle
4ab5f97

openhands openhands committed on

feat: Update leaderboard calculations and add incomplete entries toggle
5998027

openhands openhands committed on

feat: Add open_weights to openness mapping
55da48c

openhands openhands committed on

feat: Use pydantic schema models from openhands-index-results for validation
f0339f3

openhands openhands committed on

Fix: Handle pd.NA values in calculate_attempted function
28554f6

openhands openhands committed on

Update fallback category mappings: place SWE-Bench Multimodal under 'Frontend Development' and Swt-Bench under 'Test Generation'.
Co-authored-by: openhands <openhands@all-hands.dev>
b42a4fe

openhands committed on

Move commit0 to 'App Creation' category in fallback mappings.
Co-authored-by: openhands <openhands@all-hands.dev>
b16f7da

openhands committed on

Fix UI score formatting: do not coerce NaN to 0; rely on format_score_column to show 'Not Submitted'.
Co-authored-by: openhands <openhands@all-hands.dev>
c68aa7d

openhands committed on

Fix score formatting to avoid coercing NaN to 0; show 'Not Submitted' instead.
Co-authored-by: openhands <openhands@all-hands.dev>
5d82fab

openhands committed on

Fix data plotting requirements and server port handling; ensure per-benchmark plots use correct agent column.
- Respect HOST/PORT env for local runs
- Use 'OpenHands Version' in plot requirements
- Avoid plotting when use_plotly=False
Co-authored-by: openhands <openhands@all-hands.dev>
fb3d0db

openhands committed on

CRITICAL FIX: Add fallback category mappings for data without agenteval.json
b4ac443

openhands openhands committed on

Add debug logging to track data loading on HuggingFace Space
044cdf4

openhands openhands committed on

Force rebuild: Trigger HuggingFace Space to fetch latest data from GitHub
8be216f

openhands openhands committed on

Fix Categories Attempted calculation to handle missing category columns correctly
0718569

openhands committed on

Remove unused AstaBench category files and update UI to OpenHands categories
6a0d1cb

openhands committed on

Add Acknowledgements section crediting AstaBench
737a3f2

openhands committed on

Fix score calculation to match AstaBench methodology and update categories
e734bf6

openhands committed on

Fix agent_version display and make Overall Score bold
0e14c25

openhands committed on

Swap column order and fix duplicate column warnings
3781804

openhands openhands committed on

Rename columns: Agent→OpenHands Version, Models Used→Language Model, remove Submitter
376500e

openhands openhands committed on

Cleanup codebase: remove unused code, simplify data loading, and add pre-release notice
855423e

openhands openhands committed on

Force rebuild: Update comment
64c8899

openhands committed on

Fix missing get_combined_icon_html reference in main page
ae0bf64

openhands committed on

Simplify leaderboard to open vs closed models only
ca754bb

openhands committed on

Remove submission page and add submission instructions to About page
701c496

openhands openhands committed on

Add category workflow diagrams for 5 OpenHands Index categories
6324e5d

openhands openhands committed on

Update data loader to support agent-centric directory structure
e003f7b

openhands openhands committed on

Update UI: All-Hands-AI color scheme, agent version column names, and OpenHands logo
0ee2099

openhands openhands committed on

Fix category cost calculation - add category-level aggregation
5e9c3b9

openhands openhands committed on

Update category structure to OpenHands Index with 5 categories
c56f232

openhands openhands committed on

Update categories to 5 software engineering domains
0db3899

openhands openhands committed on