openhands committed on
Commit
b6f00ad
·
1 Parent(s): 721acc6

Update About page structure

- Combine Benchmarks and Scoring into 'Benchmark Details'
- Change 'Raw Data' to 'Resources' with links to:
  - OpenHands (main repo)
  - Software Agent SDK (agent code)
  - Benchmarks (benchmarking code)
  - Results (raw evaluation data)
- Update citation URL to https://index.openhands.dev

Files changed (1)
  1. about.py +10 -19
about.py CHANGED
@@ -14,10 +14,10 @@ def build_page():
  )
  gr.Markdown("---", elem_classes="divider-line")
 
- # --- Section 2: Benchmarks ---
+ # --- Section 2: Benchmark Details ---
  gr.HTML(
  """
- <h2>Benchmarks</h2>
+ <h2>Benchmark Details</h2>
  <p>We evaluate agents across five categories:</p>
  <ul class="info-list">
  <li><strong>Issue Resolution:</strong> <a href="https://www.swebench.com/" target="_blank">SWE-bench</a></li>
@@ -27,35 +27,26 @@ def build_page():
  <li><strong>Information Gathering:</strong> <a href="https://huggingface.co/gaia-benchmark" target="_blank">GAIA</a></li>
  </ul>
  <p>
- All models are evaluated using the <a href="https://github.com/OpenHands/software-agent-sdk" target="_blank">OpenHands Software Agent SDK</a>.
+ <strong>Scoring:</strong> Average score is a macro-average across benchmarks (equal weighting). Cost is USD per task; agents without cost data are shown separately in plots.
  </p>
  """
  )
  gr.Markdown("---", elem_classes="divider-line")
 
- # --- Section 3: Scoring ---
+ # --- Section 3: Resources ---
  gr.HTML(
  """
- <h2>Scoring</h2>
+ <h2>Resources</h2>
  <ul class="info-list">
- <li><strong>Average score:</strong> Macro-average across benchmarks (equal weighting)</li>
- <li><strong>Cost:</strong> USD per task; agents without cost data shown separately in plots</li>
+ <li><a href="https://github.com/OpenHands/OpenHands" target="_blank">OpenHands</a> - The main OpenHands repository</li>
+ <li><a href="https://github.com/OpenHands/software-agent-sdk" target="_blank">Software Agent SDK</a> - The agent code used for evaluation</li>
+ <li><a href="https://github.com/OpenHands/benchmarks" target="_blank">Benchmarks</a> - The benchmarking code</li>
+ <li><a href="https://github.com/OpenHands/openhands-index-results" target="_blank">Results</a> - Raw evaluation results</li>
  </ul>
  """
  )
  gr.Markdown("---", elem_classes="divider-line")
 
- # --- Section 4: Raw Data ---
- gr.HTML(
- """
- <h2>Raw Data</h2>
- <p>
- All evaluation results are available at <a href="https://github.com/OpenHands/openhands-index-results" target="_blank">github.com/OpenHands/openhands-index-results</a>.
- </p>
- """
- )
- gr.Markdown("---", elem_classes="divider-line")
-
  # --- Section 5: Contact ---
  gr.HTML(
  """
@@ -89,7 +80,7 @@ def build_page():
  title={OpenHands Index: A Comprehensive Leaderboard for AI Coding Agents},
  author={OpenHands Team},
  year={2025},
- howpublished={https://huggingface.co/spaces/OpenHands/openhands-index}
+ howpublished={https://index.openhands.dev}
  }</pre>
  """
  )
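
The scoring note in the Benchmark Details section describes the average score as an equal-weight macro-average across benchmarks, and says agents without cost data are shown separately. The sketch below illustrates one way that rule could be computed. It is not the leaderboard's actual code: the function names, the `cost_usd` key, and the example numbers are all hypothetical.

```python
from typing import Dict, List, Mapping, Optional, Tuple


def macro_average(scores: Mapping[str, float]) -> float:
    """Equal-weight mean of per-benchmark scores (one score per benchmark)."""
    if not scores:
        raise ValueError("need at least one benchmark score")
    # Each benchmark counts once, regardless of how many tasks it contains.
    return sum(scores.values()) / len(scores)


def split_by_cost(
    agents: List[Dict[str, Optional[float]]],
) -> Tuple[List[dict], List[dict]]:
    """Separate agents that report a USD-per-task cost from those that do not.

    The "cost_usd" key is an assumption made for this example.
    """
    priced = [a for a in agents if a.get("cost_usd") is not None]
    unpriced = [a for a in agents if a.get("cost_usd") is None]
    return priced, unpriced


# Hypothetical per-benchmark scores, not real leaderboard results.
example_scores = {"benchmark_a": 40.0, "benchmark_b": 60.0, "benchmark_c": 80.0}
print(macro_average(example_scores))  # 60.0
```

Because every benchmark carries the same weight, a benchmark with many tasks does not dominate the average. A task-weighted (micro) average would behave differently.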