---
title: multi-agent-strategy-openenv
sdk: docker
app_port: 8000
tags:
  - openenv
  - multi-agent
  - reinforcement-learning
  - strategy
---
# Multi-Agent Strategy OpenEnv
## Problem Statement

> Agents compete in a strategic game environment with incomplete information and evolving rules.

This environment provides a strategic benchmark where one learning agent plays against a hidden-policy opponent under fog-of-war and a shifting global rulebook.
## Why This Matches the Statement
- Multi-agent: learner vs. opponent with distinct strategies (`aggressive`, `adaptive`, `deceptive`).
- Incomplete information: the agent sees noisy opponent stats and uncertain rule hints.
- Evolving rules: scoring multipliers shift periodically and randomly during episodes.
## Environment API

Implemented with typed OpenEnv models and the standard methods `reset()`, `step(action)`, and `state()`.
Core files:

- `strategy_env/models.py`
- `strategy_env/server/environment.py`
- `strategy_env/server/app.py`
- `openenv.yaml`
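The `reset()`/`step(action)`/`state()` call pattern can be sketched against a toy stand-in environment. `StubStrategyEnv` and its fields are illustrative assumptions for this README, not the real `strategy_env.server.environment` implementation:

```python
# Illustrative stand-in showing the reset/step/state call pattern.
# All fields and the reward rule here are toy assumptions.
import random


class StubStrategyEnv:
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.turn = 0
        self.max_turns = 5

    def reset(self):
        self.turn = 0
        return {"turn": self.turn, "active_rule": "harvest_x2", "done": False}

    def step(self, action):
        self.turn += 1
        # Toy reward: 1.0 when the action matches the active rule.
        reward = 1.0 if action["action_type"] == "harvest" else 0.0
        done = self.turn >= self.max_turns
        return {"turn": self.turn, "done": done}, reward, done

    def state(self):
        return {"turn": self.turn}


env = StubStrategyEnv(seed=42)
obs = env.reset()
total = 0.0
done = False
while not done:
    obs, reward, done = env.step({"action_type": "harvest"})
    total += reward
print(total)  # 5.0
```

The real environment follows the same loop, but returns the typed models listed above instead of plain dicts.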
## Action Space

`StrategyAction.action_type` is one of:

`harvest`, `attack`, `fortify`, `scout`, `adapt`, `bluff`, `noop`
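A minimal sketch of validating against this action vocabulary (the fallback-to-`noop` behavior is an assumption; the real `StrategyAction` model in `strategy_env/models.py` may validate differently):

```python
# Illustrative action validation; fallback behavior is an assumption.
VALID_ACTIONS = {"harvest", "attack", "fortify", "scout", "adapt", "bluff", "noop"}


def validate_action(action_type: str) -> str:
    """Fall back to `noop` on unknown action types instead of raising."""
    return action_type if action_type in VALID_ACTIONS else "noop"


print(validate_action("attack"))    # attack
print(validate_action("teleport"))  # noop
```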
## Observation Space

`StrategyObservation` includes:
- task/difficulty/objective
- turn counters
- active rule and confidence-weighted rule hint
- own stats (resources/defense/intel)
- noisy opponent estimates
- public event log and last actions
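One way the noisy opponent estimates could be produced is by perturbing the true values with bounded noise. The field names and noise model below are assumptions for illustration, not the actual `StrategyObservation` definition:

```python
# Sketch of noisy opponent estimates; names and noise model are assumptions.
import random
from dataclasses import dataclass


@dataclass
class StrategyObservation:
    turn: int
    active_rule: str
    rule_hint_confidence: float
    own_resources: int
    opponent_resources_estimate: float  # noisy, not ground truth


def observe(turn, rule, true_opp_resources, noise_scale=3.0, rng=None):
    rng = rng or random.Random(0)
    # The agent never sees the true value, only a bounded-noise estimate.
    noisy = true_opp_resources + rng.uniform(-noise_scale, noise_scale)
    return StrategyObservation(
        turn=turn,
        active_rule=rule,
        rule_hint_confidence=rng.uniform(0.5, 1.0),
        own_resources=10,
        opponent_resources_estimate=max(0.0, noisy),
    )


obs = observe(turn=3, rule="attack_x2", true_opp_resources=12)
```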
## Task Set and Graders

Three deterministic tasks with easy -> medium -> hard progression:

- `easy_frontier_probe`
- `medium_alliance_shuffle`
- `hard_chaos_conclave`
Graders return 0.0-1.0 with deterministic components:
- outcome margin
- adaptation quality
- information quality
- stability (fewer invalid actions scores higher)
- efficiency
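A grader over these components can be sketched as a clipped weighted sum. The weights here are illustrative assumptions, not the values in `strategy_env/graders.py`:

```python
# Hedged sketch of a weighted grader over the five components above.
# Weights are illustrative assumptions and sum to 1.0.
WEIGHTS = {
    "outcome_margin": 0.35,
    "adaptation": 0.20,
    "information": 0.20,
    "stability": 0.15,
    "efficiency": 0.10,
}


def grade(components: dict) -> float:
    # Each component is expected in [0, 1]; clip both inputs and the
    # final weighted sum so the score stays in 0.0-1.0.
    score = sum(WEIGHTS[k] * max(0.0, min(1.0, v)) for k, v in components.items())
    return max(0.0, min(1.0, score))


score = grade({
    "outcome_margin": 0.8,
    "adaptation": 0.5,
    "information": 0.6,
    "stability": 1.0,
    "efficiency": 0.4,
})
```

Because the components are deterministic, the same episode always grades to the same score.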
## Reward Shaping
Per-step dense rewards include:
- rule-alignment signal
- utility delta vs opponent
- info gain signal from reduced uncertainty
- adaptation bonus
- invalid/loop penalties
- terminal score alignment bonus
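The shaping terms above can be sketched as a single per-step function. The coefficients are assumptions for illustration, not the values used by the environment:

```python
# Illustrative per-step shaped reward; coefficients are assumptions.
def step_reward(rule_alignment, utility_delta, info_gain,
                adapted, invalid, looped):
    reward = 0.0
    reward += 0.3 * rule_alignment  # acting in line with the active rule
    reward += 0.2 * utility_delta   # own utility change vs. the opponent
    reward += 0.2 * info_gain       # reduced uncertainty about the opponent
    if adapted:
        reward += 0.1               # bonus for reacting to a rule shift
    if invalid:
        reward -= 0.5               # penalty for invalid actions
    if looped:
        reward -= 0.2               # penalty for repetitive action loops
    return reward


r = step_reward(rule_alignment=1.0, utility_delta=0.5, info_gain=0.2,
                adapted=True, invalid=False, looped=False)
```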
## RL Training Setup

### Train policy (tabular Q-learning)

```shell
cd multi-agent-strategy-openenv
python train_rl.py
```
Optional env vars:

- `TRAIN_EPISODES` (default 4000)
- `TRAIN_ALPHA` (default 0.2)
- `TRAIN_GAMMA` (default 0.95)
- `TRAIN_EPS_START` (default 1.0)
- `TRAIN_EPS_END` (default 0.05)
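A sketch of how these variables could be read with the documented defaults (the helper name and dict layout are assumptions; `train_rl.py` may structure this differently):

```python
# Illustrative config loader mirroring the documented defaults.
import os


def load_config(env=None):
    # Pass a dict for testing; fall back to the process environment.
    env = os.environ if env is None else env
    return {
        "episodes": int(env.get("TRAIN_EPISODES", "4000")),
        "alpha": float(env.get("TRAIN_ALPHA", "0.2")),
        "gamma": float(env.get("TRAIN_GAMMA", "0.95")),
        "eps_start": float(env.get("TRAIN_EPS_START", "1.0")),
        "eps_end": float(env.get("TRAIN_EPS_END", "0.05")),
    }


cfg = load_config({})  # defaults when no overrides are set
```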
Recommended tuning command for stronger medium performance (PowerShell):

```powershell
$env:TRAIN_EPISODES="6000"
$env:TRAIN_ALPHA="0.16"
$env:TRAIN_GAMMA="0.98"
$env:TRAIN_EPS_END="0.02"
python train_rl.py
```
Artifacts:

- `artifacts/q_policy.json`
- `artifacts/training_history.json`
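The core tabular Q-learning update behind `train_rl.py` can be sketched as follows; the string state encoding is a toy assumption for illustration:

```python
# Minimal tabular Q-learning update; state encoding is a toy assumption.
from collections import defaultdict

ACTIONS = ["harvest", "attack", "fortify", "scout", "adapt", "bluff", "noop"]
ALPHA, GAMMA = 0.2, 0.95        # the documented defaults
Q = defaultdict(float)           # keyed by (state, action), default 0.0


def q_update(state, action, reward, next_state):
    # Standard Q-learning target: r + gamma * max_a' Q(s', a').
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    td_error = reward + GAMMA * best_next - Q[(state, action)]
    Q[(state, action)] += ALPHA * td_error


q_update("turn0", "harvest", 1.0, "turn1")
```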
### Evaluate trained policy

```shell
python evaluate_policy.py
```

Output:

- per-task average score/margin
- `artifacts/evaluation_report.json`
### Inference script

```shell
python inference.py
```

`inference.py` calls `evaluate_policy.py` and is submission-friendly for automated evaluation.
## Frontend Playground
An interactive UI is included to manually control episodes and inspect reward/grader dynamics in real time.
After starting the server, open:
http://127.0.0.1:8000/playground
Frontend capabilities:
- start/reset episodes by task and seed
- manual action control + optional action payload
- auto-step mode using a rule-aware policy
- live telemetry cards for resources, defense, intel, and rule hints
- live event feed and step timeline
- grader breakdown bars and final score view
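The auto-step mode's rule-aware policy can be sketched as a lookup keyed on the active rule, with a scout fallback when the rule hint is uncertain. The rule names, threshold, and mapping below are hypothetical:

```python
# Hypothetical rule-aware auto-step policy; rule names are assumptions.
RULE_TO_ACTION = {
    "harvest_x2": "harvest",
    "attack_x2": "attack",
    "fortify_x2": "fortify",
}


def auto_step(active_rule, rule_confidence, threshold=0.6):
    # With a low-confidence rule hint, gather information instead.
    if rule_confidence < threshold:
        return "scout"
    return RULE_TO_ACTION.get(active_rule, "noop")


print(auto_step("harvest_x2", 0.9))  # harvest
print(auto_step("harvest_x2", 0.3))  # scout
```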
## Windows Setup

```shell
python -m pip install -r requirements.txt
```

If `openenv` is not on PATH:

```powershell
& "$env:APPDATA\Python\Python313\Scripts\openenv.exe" --help
```
## Validate and Serve

```powershell
& "$env:APPDATA\Python\Python313\Scripts\openenv.exe" validate --verbose
```

```shell
python -m uvicorn strategy_env.server.app:app --host 0.0.0.0 --port 8000 --reload
```
## Docker

```shell
docker build -t multi-agent-strategy-openenv:latest -f server/Dockerfile .
docker run --rm -p 8000:8000 multi-agent-strategy-openenv:latest
```
## Hugging Face Spaces Deploy

```shell
openenv push --repo-id <your-username>/multi-agent-strategy-openenv
```
Project Layout
multi-agent-strategy-openenv/
|- server/
| |- app.py
| |- Dockerfile
|- strategy_env/
| |- frontend/
| | |- index.html
| | |- assets/
| | |- styles.css
| | |- app.js
| |- models.py
| |- client.py
| |- tasks.py
| |- graders.py
| |- server/
| |- app.py
| |- environment.py
|- artifacts/
|- train_rl.py
|- evaluate_policy.py
|- inference.py
|- openenv.yaml
|- pyproject.toml
|- requirements.txt
|- uv.lock
|- README.md