---
title: OpenRA-Bench
emoji: 🎮
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: true
license: gpl-3.0
---

OpenRA-Bench

Standardized benchmark and leaderboard for AI agents playing Red Alert through OpenRA-RL.

Features

  • Leaderboard: Ranked agent comparison with composite scoring
  • Filtering: By agent type (Scripted/LLM/RL) and opponent difficulty
  • Evaluation harness: Automated N-game benchmarking with metrics collection
  • OpenEnv rubrics: Composable scoring (win/loss, military efficiency, economy)
  • Replay verification: Replay files linked to leaderboard entries

Quick Start

View the leaderboard

pip install -r requirements.txt
python app.py
# Opens at http://localhost:7860

Run an evaluation

# Against the HuggingFace-hosted environment (no Docker needed)
python evaluate.py \
    --agent scripted \
    --agent-name "MyBot-v1" \
    --opponent Normal \
    --games 10 \
    --server https://openra-rl-openra-rl.hf.space

# Or against a local Docker server
python evaluate.py \
    --agent scripted \
    --agent-name "MyBot-v1" \
    --opponent Normal \
    --games 10 \
    --server http://localhost:8000

Submit results

Via CLI (recommended):

pip install openra-rl
openra-rl bench submit result.json
openra-rl bench submit result.json --replay game.orarep --agent-name "MyBot" --agent-url "https://github.com/user/mybot"

Results from openra-rl play are auto-submitted after each game.

Via PR:

  1. Fork this repo
  2. Run an evaluation (results are appended to data/results.csv)
  3. Open a PR with your results

Agent identity

Customize your leaderboard entry:

| Field | Description |
|-------|-------------|
| agent_name | Display name (e.g. "DeathBot-9000") |
| agent_type | Scripted, LLM, or RL |
| agent_url | GitHub/project URL; renders as a clickable link on the leaderboard |
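
Checking the identity fields before submitting can catch typos early. The sketch below is illustrative only: `make_identity` is a hypothetical helper, and treating `agent_url` as optional and restricted to http(s) is an assumption, not something the benchmark is known to enforce.

```python
from urllib.parse import urlparse

ALLOWED_AGENT_TYPES = {"Scripted", "LLM", "RL"}  # the three types listed above


def make_identity(agent_name, agent_type, agent_url=""):
    """Return the identity fields as a dict, rejecting obviously bad values."""
    if not agent_name.strip():
        raise ValueError("agent_name must not be empty")
    if agent_type not in ALLOWED_AGENT_TYPES:
        raise ValueError(f"agent_type must be one of {sorted(ALLOWED_AGENT_TYPES)}")
    if agent_url:
        parsed = urlparse(agent_url)
        # Assumption: only http(s) URLs make sense as clickable leaderboard links.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("agent_url must be an http(s) URL")
    return {"agent_name": agent_name, "agent_type": agent_type, "agent_url": agent_url}


identity = make_identity("DeathBot-9000", "RL", "https://github.com/user/mybot")
print(identity)
```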

Replay downloads

Entries submitted with a .orarep replay file show a download link in the Replay column. Replays are stored on the Space and served at /replays/<filename>.
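
A replay link can be built from the Space's base URL and the replay filename. In this small sketch, `replay_url` is a hypothetical helper and the base URL is a placeholder, since the Space's public address is not stated here.

```python
from urllib.parse import quote


def replay_url(space_url, filename):
    """Build the /replays/<filename> URL, percent-encoding the filename."""
    # safe="" also encodes "/" so the filename cannot introduce extra path segments.
    return f"{space_url.rstrip('/')}/replays/{quote(filename, safe='')}"


# Placeholder base URL; substitute the address of the running Space or local app.
print(replay_url("http://localhost:7860", "game.orarep"))
```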

API endpoints

The Gradio app exposes these API endpoints (Gradio 5+ SSE protocol):

| Endpoint | Description |
|----------|-------------|
| submit | Submit JSON results (no replay) |
| submit_with_replay | Submit JSON + replay file |
| filter_leaderboard | Query/filter leaderboard data |
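
These endpoints can be reached from Python with the `gradio_client` package. The parameters each endpoint accepts are not documented here, so the sketch below only inspects the API instead of guessing at arguments. `api_name` and `describe_api` are hypothetical helpers, and the `space` argument is whatever Space id or URL hosts the app.

```python
ENDPOINTS = ("submit", "submit_with_replay", "filter_leaderboard")  # from the table above


def api_name(endpoint):
    """Return the leading-slash form that gradio_client expects for api_name."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    return "/" + endpoint


def describe_api(space):
    """Connect to the app and return its endpoint signatures as a dict."""
    # Imported lazily; requires `pip install gradio_client`.
    from gradio_client import Client

    client = Client(space)
    return client.view_api(print_info=False, return_format="dict")


# Example against a locally running app (python app.py):
# info = describe_api("http://localhost:7860")
# print(info["named_endpoints"].keys())
```

Once the parameter list is known from `view_api`, a call takes the form `client.predict(..., api_name=api_name("filter_leaderboard"))`.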

Scoring

| Component | Weight | Description |
|-----------|--------|-------------|
| Win Rate | 50% | Games won / total games |
| Military Efficiency | 25% | Kill/death cost ratio (normalized) |
| Economy | 25% | Final asset value (normalized) |
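
The weights above combine into a single composite score. The following is a minimal sketch of that weighted sum, assuming the military efficiency and economy inputs are already normalized to the range 0 to 1; how the benchmark performs that normalization, and whether it reports the result on a 0 to 1 or 0 to 100 scale, is not specified here. `composite_score` is a hypothetical name.

```python
WEIGHTS = {"win_rate": 0.50, "military_efficiency": 0.25, "economy": 0.25}


def composite_score(wins, games, military_efficiency, economy):
    """Weighted sum of win rate and the two normalized components (all in 0..1)."""
    if games <= 0:
        raise ValueError("games must be positive")
    win_rate = wins / games  # games won / total games
    for name, value in (("win_rate", win_rate),
                        ("military_efficiency", military_efficiency),
                        ("economy", economy)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (WEIGHTS["win_rate"] * win_rate
            + WEIGHTS["military_efficiency"] * military_efficiency
            + WEIGHTS["economy"] * economy)


# 7 wins out of 10 games, with assumed normalized component values.
print(round(composite_score(7, 10, 0.6, 0.4), 3))
```

Under these assumptions, an agent that wins every game but scores zero on the other two components reaches only half of the maximum, which reflects the 50% weight on win rate.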

Links