metadata

title: OpenRA-Bench
emoji: 🎮
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: true
license: gpl-3.0

OpenRA-Bench

Standardized benchmark and leaderboard for AI agents playing Red Alert through OpenRA-RL.

Features

Leaderboard: Ranked agent comparison with composite scoring
Filtering: By agent type (Scripted/LLM/RL) and opponent difficulty
Evaluation harness: Automated N-game benchmarking with metrics collection
OpenEnv rubrics: Composable scoring (win/loss, military efficiency, economy)
Replay verification: Replay files linked to leaderboard entries

Quick Start

View the leaderboard

pip install -r requirements.txt
python app.py
# Opens at http://localhost:7860

Run an evaluation

# Against the HuggingFace-hosted environment (no Docker needed)
python evaluate.py \
    --agent scripted \
    --agent-name "MyBot-v1" \
    --opponent Normal \
    --games 10 \
    --server https://openra-rl-openra-rl.hf.space

# Or against a local Docker server
python evaluate.py \
    --agent scripted \
    --agent-name "MyBot-v1" \
    --opponent Normal \
    --games 10 \
    --server http://localhost:8000
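
The two invocations above differ only in the `--server` value, so a small wrapper can build the command for either target. This is a minimal sketch that uses only the flags shown above; the helper name `build_eval_command` is hypothetical and not part of this repo.

```python
import shlex
import sys


def build_eval_command(agent, agent_name, opponent, games, server):
    """Return the argv list for one evaluate.py run (flags taken from the README)."""
    return [
        sys.executable, "evaluate.py",
        "--agent", agent,
        "--agent-name", agent_name,
        "--opponent", opponent,
        "--games", str(games),  # argv entries must be strings
        "--server", server,
    ]


cmd = build_eval_command(
    "scripted", "MyBot-v1", "Normal", 10, "http://localhost:8000"
)
# Print the command (without the interpreter path) for inspection.
print(shlex.join(cmd[1:]))
# To actually launch it: subprocess.run(cmd, check=True)
```

Passing an argv list rather than a shell string avoids quoting problems when the agent name contains spaces.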

Submit results

Via CLI (recommended):

pip install openra-rl
openra-rl bench submit result.json
openra-rl bench submit result.json --replay game.orarep --agent-name "MyBot" --agent-url "https://github.com/user/mybot"

Results from openra-rl play are auto-submitted after each game.

Via PR:

1. Fork this repo.
2. Run an evaluation with evaluate.py, which appends your results to data/results.csv.
3. Open a PR with your results.

Agent identity

Customize your leaderboard entry:

| Field | Description |
|---|---|
| `agent_name` | Display name (e.g. "DeathBot-9000") |
| `agent_type` | `Scripted`, `LLM`, or `RL` |
| `agent_url` | GitHub/project URL; renders as a clickable link on the leaderboard |
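
Before submitting, it can help to sanity-check these three fields locally. The sketch below assumes they appear as top-level keys of the result entry and that `agent_url` is optional; neither is confirmed by this README, and `check_identity` is a hypothetical helper, not part of the `openra-rl` CLI.

```python
ALLOWED_AGENT_TYPES = {"Scripted", "LLM", "RL"}  # values listed in the table above


def check_identity(entry):
    """Return a list of problems found in the identity fields of `entry`."""
    problems = []
    if not str(entry.get("agent_name", "")).strip():
        problems.append("agent_name is empty")
    if entry.get("agent_type") not in ALLOWED_AGENT_TYPES:
        problems.append("agent_type must be Scripted, LLM, or RL")
    url = entry.get("agent_url", "")
    # Assumption: agent_url may be omitted, but if present it should be a web URL.
    if url and not url.startswith(("http://", "https://")):
        problems.append("agent_url should be an http(s) URL")
    return problems


entry = {
    "agent_name": "DeathBot-9000",
    "agent_type": "RL",
    "agent_url": "https://github.com/user/mybot",
}
print(check_identity(entry))
```

An empty list means the identity fields look consistent with the table; anything else names the field to fix.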

Replay downloads

Entries submitted with a .orarep replay file show a download link in the Replay column. Replays are stored on the Space and served at /replays/<filename>.
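
As a small illustration of the `/replays/<filename>` path, the sketch below joins a Space base URL and a replay filename. The base URL is a placeholder (this README does not state the Space's public URL), and `replay_url` is a hypothetical helper.

```python
from urllib.parse import quote


def replay_url(space_url, filename):
    """Build the download URL for a replay stored on the Space."""
    # Percent-encode the filename so spaces and slashes cannot alter the path.
    return f"{space_url.rstrip('/')}/replays/{quote(filename, safe='')}"


# Placeholder base URL; substitute the real Space URL.
print(replay_url("https://example.hf.space/", "game 1.orarep"))
```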

API endpoints

The Gradio app exposes these API endpoints (Gradio 5+ SSE protocol):

| Endpoint | Description |
|---|---|
| `submit` | Submit JSON results (no replay) |
| `submit_with_replay` | Submit JSON + replay file |
| `filter_leaderboard` | Query/filter leaderboard data |
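
These endpoints can be called from Python with the `gradio_client` package. The sketch below is hedged: it assumes `submit` accepts a single positional argument containing the results as a JSON string, which this README does not confirm. Run `client.view_api()` against the live Space to see the real signatures before relying on it. The helper `submit_results` is hypothetical.

```python
import json


def submit_results(client, result):
    """Send a results dict to the `submit` endpoint and return the reply.

    `client` is any object with a gradio_client-style predict() method.
    ASSUMPTION: the endpoint takes one positional argument, a JSON string.
    """
    payload = json.dumps(result)
    return client.predict(payload, api_name="/submit")


# Usage with the real client (requires `pip install gradio_client` and network):
#   from gradio_client import Client
#   client = Client("openra-rl/OpenRA-Bench")  # assumed Space id
#   client.view_api()                          # prints the actual endpoint signatures
#   print(submit_results(client, {"agent_name": "MyBot"}))
```

Taking the client as a parameter keeps the function testable offline with a stand-in object.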

Scoring

| Component | Weight | Description |
|---|---|---|
| Win Rate | 50% | Games won / total games |
| Military Efficiency | 25% | Kill/death cost ratio (normalized) |
| Economy | 25% | Final asset value (normalized) |
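
The weights above combine into a single composite score. The sketch below assumes each component has already been normalized to the range 0 to 1 and that the composite is a plain weighted sum; the README does not describe the normalization method or the final scale used on the leaderboard, so treat both as assumptions.

```python
# Weights from the scoring table.
WEIGHTS = {"win_rate": 0.50, "military_efficiency": 0.25, "economy": 0.25}


def composite_score(win_rate, military_efficiency, economy):
    """Weighted sum of the three components, each assumed to lie in [0, 1]."""
    components = {
        "win_rate": win_rate,
        "military_efficiency": military_efficiency,
        "economy": economy,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())


# 7 wins in 10 games, with illustrative normalized efficiency and economy values.
score = composite_score(win_rate=0.7, military_efficiency=0.6, economy=0.4)
print(round(score, 3))
```

Because win rate carries half the weight, an agent that wins every game scores at least 0.5 under these assumptions, regardless of the other two components.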