
multi-agent-strat / README.md

Avnishjain

Upload 21 files

6888575 verified 8 days ago


title: multi-agent-strategy-openenv
sdk: docker
app_port: 8000
tags:
  - openenv
  - multi-agent
  - reinforcement-learning
  - strategy

Multi-Agent Strategy OpenEnv

Problem statement implemented:

Agents compete in a strategic game environment with incomplete information and evolving rules.

This environment provides a strategic benchmark where one learning agent plays against a hidden-policy opponent under fog-of-war and a shifting global rulebook.

Why This Matches the Statement

Multi-agent: learner vs opponent with distinct strategies (aggressive, adaptive, deceptive).
Incomplete information: agent sees noisy opponent stats and uncertain rule hints.
Evolving rules: scoring multipliers shift periodically and randomly during episodes.
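
The "periodically and randomly" rule shifts can be pictured with a small sketch. Everything in it is an assumption made for illustration: the rule names, the multipliers, the `period` and `p_random` parameters, and the `next_rule` helper are hypothetical and are not taken from `strategy_env`.

```python
import random

# Hypothetical rulebook: each rule scales the payoff of some actions.
RULEBOOK = {
    "economy": {"harvest": 1.5, "attack": 0.8},
    "warfare": {"harvest": 0.8, "attack": 1.5},
}


def next_rule(current, turn, rng, period=5, p_random=0.1):
    """Switch rules every `period` turns, or at random with probability p_random."""
    scheduled = turn > 0 and turn % period == 0
    surprise = rng.random() < p_random
    if scheduled or surprise:
        # Pick any rule other than the one currently active.
        others = [name for name in RULEBOOK if name != current]
        return rng.choice(others)
    return current


rng = random.Random(0)
rule = "economy"
for turn in range(1, 11):
    rule = next_rule(rule, turn, rng)
```

Passing a seeded `random.Random` keeps the random shifts reproducible, which is one way an environment could stay deterministic per seed.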

Environment API

Implemented with typed OpenEnv models and standard methods:

reset()
step(action)
state()
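
As a rough picture of how these three methods fit together, here is a minimal stand-in environment and rollout loop. Only the method names and the action names come from this README; `ToyStrategyEnv`, `StepResult`, `run_episode`, and the toy dynamics are hypothetical and do not reflect the real typed models in `strategy_env/models.py`.

```python
from dataclasses import dataclass

# Action names are taken from the README; everything else is a stand-in.
ACTION_TYPES = ("harvest", "attack", "fortify", "scout", "adapt", "bluff", "noop")


@dataclass
class StepResult:
    observation: dict
    reward: float
    done: bool


class ToyStrategyEnv:
    """Minimal stand-in exposing the same reset/step/state method names."""

    def __init__(self, max_turns: int = 5):
        self.max_turns = max_turns
        self.turn = 0
        self.resources = 0

    def reset(self) -> dict:
        self.turn = 0
        self.resources = 0
        return self.state()

    def step(self, action_type: str) -> StepResult:
        if action_type not in ACTION_TYPES:
            raise ValueError(f"unknown action_type: {action_type!r}")
        self.turn += 1
        # Toy dynamics: only harvesting changes anything.
        gained = 1 if action_type == "harvest" else 0
        self.resources += gained
        done = self.turn >= self.max_turns
        return StepResult(self.state(), float(gained), done)

    def state(self) -> dict:
        return {"turn": self.turn, "resources": self.resources}


def run_episode(env: ToyStrategyEnv, policy) -> float:
    """Roll out one episode and return the summed reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        result = env.step(policy(obs))
        obs, done = result.observation, result.done
        total += result.reward
    return total


total = run_episode(ToyStrategyEnv(max_turns=5), lambda obs: "harvest")
```

The policy is just a callable from observation to action name, so a random policy, a scripted policy, or a learned Q-table lookup can all be passed to the same loop.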

Core files:

strategy_env/models.py
strategy_env/server/environment.py
strategy_env/server/app.py
openenv.yaml

Action Space

StrategyAction.action_type:

harvest
attack
fortify
scout
adapt
bluff
noop

Observation Space

StrategyObservation includes:

task/difficulty/objective
turn counters
active rule and confidence-weighted rule hint
own stats (resources/defense/intel)
noisy opponent estimates
public event log and last actions
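
To make "noisy opponent estimates" concrete, the sketch below shows one way such an observation could be produced. It is an assumption, not the actual `StrategyObservation` schema: the field names, the `noisy_estimate` helper, and the idea that noise shrinks as intel grows are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyObservation:
    # Field names are illustrative, not the real StrategyObservation schema.
    turn: int
    rule_hint: str
    rule_hint_confidence: float
    own_resources: int
    opponent_resources_estimate: float


def noisy_estimate(true_value: float, intel: int, rng: random.Random) -> float:
    """Return true_value plus Gaussian noise that shrinks as intel grows."""
    sigma = 3.0 / (1 + intel)  # hypothetical noise schedule
    return max(0.0, true_value + rng.gauss(0.0, sigma))


rng = random.Random(42)
obs = ToyObservation(
    turn=1,
    rule_hint="economy",
    rule_hint_confidence=0.6,
    own_resources=4,
    opponent_resources_estimate=noisy_estimate(10.0, intel=2, rng=rng),
)
```

Under this assumption, an action such as scout would raise intel and therefore tighten the estimate on later turns.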

Task Set and Graders

Three deterministic tasks with easy -> medium -> hard progression:

easy_frontier_probe
medium_alliance_shuffle
hard_chaos_conclave

Graders return a score between 0.0 and 1.0, built from deterministic components:

outcome margin
adaptation quality
information quality
stability (invalid actions)
efficiency
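
One plausible way to combine these components into a single score is a clamped weighted sum. The weights and the `grade` function below are hypothetical: the README names the components but does not state how `strategy_env/graders.py` weights them.

```python
# Hypothetical weights; they sum to 1.0 so a perfect run scores 1.0.
WEIGHTS = {
    "outcome_margin": 0.4,
    "adaptation": 0.2,
    "information": 0.15,
    "stability": 0.15,
    "efficiency": 0.1,
}


def grade(components: dict[str, float]) -> float:
    """Clamp each component to [0, 1], then return the clamped weighted sum."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, components.get(name, 0.0)))
        total += weight * value
    return round(min(1.0, max(0.0, total)), 4)


score = grade({
    "outcome_margin": 0.5,
    "adaptation": 1.0,
    "information": 0.0,
    "stability": 1.0,
    "efficiency": 0.5,
})
```

Because the function has no randomness and clamps every input, the same episode statistics always map to the same score in the 0.0-1.0 range.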

Reward Shaping

Per-step dense rewards include:

rule-alignment signal
utility delta vs opponent
info gain signal from reduced uncertainty
adaptation bonus
invalid/loop penalties
terminal score alignment bonus
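
The listed terms could be summed into a per-step reward roughly as follows. This is a sketch under stated assumptions: the `StepSignals` fields, every coefficient, and the loop threshold are invented for illustration and are not the values used by the environment.

```python
from dataclasses import dataclass


@dataclass
class StepSignals:
    # Hypothetical per-step signals mirroring the reward terms listed above.
    rule_aligned: bool
    utility_delta: float          # own utility change minus opponent's
    uncertainty_before: float
    uncertainty_after: float
    adapted_after_rule_shift: bool
    invalid_action: bool
    repeated_action_streak: int


def shaped_reward(s: StepSignals) -> float:
    """Sum the dense reward terms; all coefficients are illustrative."""
    reward = 0.0
    reward += 0.10 if s.rule_aligned else 0.0
    reward += 0.05 * s.utility_delta
    # Info gain only rewards a reduction in uncertainty, never an increase.
    reward += 0.05 * max(0.0, s.uncertainty_before - s.uncertainty_after)
    reward += 0.10 if s.adapted_after_rule_shift else 0.0
    if s.invalid_action:
        reward -= 0.20
    if s.repeated_action_streak >= 3:
        reward -= 0.05
    return reward


def terminal_bonus(grader_score: float) -> float:
    """Extra reward at episode end, proportional to the final grader score."""
    return 0.5 * grader_score


example = StepSignals(True, 2.0, 1.0, 0.5, False, False, 0)
step_reward = shaped_reward(example)
```

Tying the terminal bonus to the grader score keeps the dense per-step signal pointed in the same direction as the final evaluation.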

RL Training Setup

Train policy (tabular Q-learning)

cd multi-agent-strategy-openenv
python train_rl.py

Optional env vars:

TRAIN_EPISODES (default 4000)
TRAIN_ALPHA (default 0.2)
TRAIN_GAMMA (default 0.95)
TRAIN_EPS_START (default 1.0)
TRAIN_EPS_END (default 0.05)
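
For orientation, the sketch below shows how these variables would typically feed a tabular Q-learning update. The variable names and defaults come from the list above; the linear epsilon decay, the state encoding, and the helper names are assumptions and may differ from what `train_rl.py` actually does.

```python
import os
import random
from collections import defaultdict

# Env var names and defaults as documented above.
EPISODES = int(os.environ.get("TRAIN_EPISODES", "4000"))
ALPHA = float(os.environ.get("TRAIN_ALPHA", "0.2"))
GAMMA = float(os.environ.get("TRAIN_GAMMA", "0.95"))
EPS_START = float(os.environ.get("TRAIN_EPS_START", "1.0"))
EPS_END = float(os.environ.get("TRAIN_EPS_END", "0.05"))

ACTIONS = ("harvest", "attack", "fortify", "scout", "adapt", "bluff", "noop")


def epsilon_at(episode: int, total: int = EPISODES) -> float:
    """Linear decay from EPS_START to EPS_END (the decay shape is an assumption)."""
    frac = min(1.0, episode / max(1, total - 1))
    return EPS_START + (EPS_END - EPS_START) * frac


def choose_action(q, state, epsilon: float, rng: random.Random) -> str:
    """Epsilon-greedy selection over the tabular Q-values."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, done) -> None:
    """Standard Q-learning update: Q += alpha * (target - Q)."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (target - q[(state, action)])


q = defaultdict(float)  # maps (state, action) -> value, defaulting to 0.0
q_update(q, "s0", "harvest", 1.0, "s1", done=True)
```

Here a lower `TRAIN_ALPHA` makes each update smaller and smoother, a higher `TRAIN_GAMMA` weights future reward more heavily, and a lower `TRAIN_EPS_END` leaves less random exploration late in training.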

Recommended settings (PowerShell) for stronger performance on the medium task:

$env:TRAIN_EPISODES="6000"
$env:TRAIN_ALPHA="0.16"
$env:TRAIN_GAMMA="0.98"
$env:TRAIN_EPS_END="0.02"
python train_rl.py

Artifacts:

artifacts/q_policy.json
artifacts/training_history.json

Evaluate trained policy

python evaluate_policy.py

Output:

per-task average score/margin
artifacts/evaluation_report.json

Inference script

python inference.py

inference.py calls evaluate_policy.py, so automated evaluation can use it as a single entry point.

Frontend Playground

An interactive UI is included to manually control episodes and inspect reward/grader dynamics in real time.

After starting the server, open:

http://127.0.0.1:8000/playground

Frontend capabilities:

start/reset episodes by task and seed
manual action control + optional action payload
auto-step mode using a rule-aware policy
live telemetry cards for resources, defense, intel, and rule hints
live event feed and step timeline
grader breakdown bars and final score view

Windows Setup

python -m pip install -r requirements.txt

If openenv is not on PATH, call it by its full path (adjust Python313 to match your installed Python version):

& "$env:APPDATA\Python\Python313\Scripts\openenv.exe" --help

Validate and Serve

& "$env:APPDATA\Python\Python313\Scripts\openenv.exe" validate --verbose
python -m uvicorn strategy_env.server.app:app --host 0.0.0.0 --port 8000 --reload

Docker

docker build -t multi-agent-strategy-openenv:latest -f server/Dockerfile .
docker run --rm -p 8000:8000 multi-agent-strategy-openenv:latest

Hugging Face Spaces Deploy

openenv push --repo-id <your-username>/multi-agent-strategy-openenv

Project Layout

multi-agent-strategy-openenv/
|- server/
|  |- app.py
|  |- Dockerfile
|- strategy_env/
|  |- frontend/
|  |  |- index.html
|  |  |- assets/
|  |     |- styles.css
|  |     |- app.js
|  |- models.py
|  |- client.py
|  |- tasks.py
|  |- graders.py
|  |- server/
|     |- app.py
|     |- environment.py
|- artifacts/
|- train_rl.py
|- evaluate_policy.py
|- inference.py
|- openenv.yaml
|- pyproject.toml
|- requirements.txt
|- uv.lock
|- README.md