metadata
title: multi-agent-strategy-openenv
sdk: docker
app_port: 8000
tags:
  - openenv
  - multi-agent
  - reinforcement-learning
  - strategy

Multi-Agent Strategy OpenEnv

Problem statement implemented:

Agents compete in a strategic game environment with incomplete information and evolving rules.

This environment provides a strategic benchmark where one learning agent plays against a hidden-policy opponent under fog-of-war and a shifting global rulebook.

Why This Matches the Statement

  • Multi-agent: a learner plays against an opponent that follows one of several distinct strategies (aggressive, adaptive, deceptive).
  • Incomplete information: the agent sees only noisy opponent stats and uncertain rule hints.
  • Evolving rules: scoring multipliers shift both periodically and at random during an episode.
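The two mechanisms above, noisy opponent stats and a shifting rulebook, can be sketched in a few lines. This is a minimal illustration only: the rule names, multipliers, noise model, and shift schedule are all assumptions and are not taken from strategy_env/server/environment.py.

```python
import random

# Hypothetical rule names and scoring multipliers, chosen only for illustration.
RULEBOOK = {
    "economy": {"harvest": 1.5, "attack": 0.8},
    "warfare": {"harvest": 0.8, "attack": 1.5},
}


def noisy_estimate(true_value: float, noise: float, rng: random.Random) -> float:
    """Return the true value perturbed by uniform noise in [-noise, +noise]."""
    return true_value + rng.uniform(-noise, noise)


def next_rule(current: str, turn: int, period: int, shift_prob: float,
              rng: random.Random) -> str:
    """Switch rules on every `period`-th turn, or at random with `shift_prob`."""
    periodic = turn > 0 and turn % period == 0
    if periodic or rng.random() < shift_prob:
        others = [name for name in RULEBOOK if name != current]
        return rng.choice(others)
    return current


rng = random.Random(0)  # a fixed seed keeps the episode reproducible
rule = "economy"
for turn in range(1, 7):
    # shift_prob=0.0 disables random shifts so only the periodic ones fire
    rule = next_rule(rule, turn, period=3, shift_prob=0.0, rng=rng)
    estimate = noisy_estimate(10.0, noise=2.0, rng=rng)
    print(turn, rule, round(estimate, 2))
```

Passing a seeded `random.Random` instance instead of using the module-level functions is what lets a task stay reproducible while still containing random shifts.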

Environment API

Implemented with typed OpenEnv models and standard methods:

  • reset()
  • step(action)
  • state()

Core files:

  • strategy_env/models.py
  • strategy_env/server/environment.py
  • strategy_env/server/app.py
  • openenv.yaml

Action Space

StrategyAction.action_type:

  • harvest
  • attack
  • fortify
  • scout
  • adapt
  • bluff
  • noop
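A client may want to check an action name before sending it, since invalid actions count against the stability grader. The sketch below is one way to do that; `ActionType` and `parse_action` are illustrative names, not the definitions in strategy_env/models.py.

```python
from enum import Enum
from typing import Optional


class ActionType(str, Enum):
    """The seven action names listed above (the enum itself is a sketch)."""
    HARVEST = "harvest"
    ATTACK = "attack"
    FORTIFY = "fortify"
    SCOUT = "scout"
    ADAPT = "adapt"
    BLUFF = "bluff"
    NOOP = "noop"


def parse_action(raw: str) -> Optional[ActionType]:
    """Return the matching ActionType, or None when the name is unknown."""
    try:
        return ActionType(raw.strip().lower())
    except ValueError:
        return None


print(parse_action(" Attack "))
print(parse_action("retreat"))
```

Returning `None` rather than silently falling back to `noop` leaves the caller to decide how to handle a bad name.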

Observation Space

StrategyObservation includes:

  • task/difficulty/objective
  • turn counters
  • active rule and confidence-weighted rule hint
  • own stats (resources/defense/intel)
  • noisy opponent estimates
  • public event log and last actions
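For orientation, the groups above could map onto a structure like the following. Every field name here is an assumption made for illustration; the authoritative schema is StrategyObservation in strategy_env/models.py.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StatsSketch:
    resources: float
    defense: float
    intel: float


@dataclass
class ObservationSketch:
    """Hypothetical layout of the observation groups listed above."""
    task: str
    difficulty: str
    objective: str
    turn: int
    max_turns: int
    active_rule: str
    rule_hint: str
    rule_hint_confidence: float  # 0.0 = pure guess, 1.0 = certain
    own: StatsSketch
    opponent_estimate: StatsSketch  # noisy, not ground truth
    events: List[str] = field(default_factory=list)
    last_actions: List[str] = field(default_factory=list)

    def turns_remaining(self) -> int:
        return max(0, self.max_turns - self.turn)

    def estimated_resource_lead(self) -> float:
        """Own resources minus the noisy estimate of the opponent's."""
        return self.own.resources - self.opponent_estimate.resources


example = ObservationSketch(
    task="easy_frontier_probe",
    difficulty="easy",
    objective="illustrative objective text",
    turn=3,
    max_turns=10,
    active_rule="economy",
    rule_hint="warfare",
    rule_hint_confidence=0.4,
    own=StatsSketch(resources=12.0, defense=5.0, intel=2.0),
    opponent_estimate=StatsSketch(resources=10.0, defense=6.0, intel=1.0),
)
print(example.turns_remaining(), example.estimated_resource_lead())
```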

Task Set and Graders

Three deterministic tasks with easy -> medium -> hard progression:

  • easy_frontier_probe
  • medium_alliance_shuffle
  • hard_chaos_conclave

Each grader returns a score between 0.0 and 1.0, computed from deterministic components:

  • outcome margin
  • adaptation quality
  • information quality
  • stability (invalid actions)
  • efficiency
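One plausible way to fold these components into a single bounded score is a clamped weighted sum. The weights and key names below are hypothetical; the actual formula lives in strategy_env/graders.py and may combine the components differently.

```python
# Hypothetical weights that sum to 1.0; not taken from graders.py.
WEIGHTS = {
    "outcome_margin": 0.40,
    "adaptation": 0.20,
    "information": 0.15,
    "stability": 0.15,
    "efficiency": 0.10,
}


def clamp01(value: float) -> float:
    """Clamp a value into the closed interval [0.0, 1.0]."""
    return max(0.0, min(1.0, value))


def grade(components: dict) -> float:
    """Weighted sum of clamped components; missing components count as 0."""
    total = sum(
        weight * clamp01(components.get(name, 0.0))
        for name, weight in WEIGHTS.items()
    )
    return clamp01(total)


print(grade({"outcome_margin": 0.5, "stability": 1.0}))
```

Because the function has no randomness and no hidden state, the same episode statistics always produce the same score.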

Reward Shaping

Per-step dense rewards include:

  • rule-alignment signal
  • utility delta vs opponent
  • info gain signal from reduced uncertainty
  • adaptation bonus
  • invalid/loop penalties
  • terminal score alignment bonus
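The terms above could be combined into one scalar reward per step roughly as follows. The coefficients, field names, and additive form are all assumptions for illustration, not the values used by the environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepSignals:
    """Hypothetical per-step inputs to the shaped reward."""
    rule_alignment: float   # how well the action fits the active rule
    utility_delta: float    # change in (own - opponent) utility
    info_gain: float        # reduction in uncertainty about the opponent
    adapted: bool           # agent changed behavior after a rule shift
    invalid: bool           # action was rejected
    looping: bool           # agent keeps repeating the same action
    terminal_score: Optional[float] = None  # grader score on the last step


def shaped_reward(s: StepSignals) -> float:
    # All coefficients below are illustrative placeholders.
    reward = 0.3 * s.rule_alignment
    reward += 0.1 * s.utility_delta
    reward += 0.2 * s.info_gain
    if s.adapted:
        reward += 0.05
    if s.invalid:
        reward -= 0.5
    if s.looping:
        reward -= 0.1
    if s.terminal_score is not None:
        reward += s.terminal_score  # ties the dense reward to the final grade
    return reward


step = StepSignals(rule_alignment=1.0, utility_delta=2.0, info_gain=0.5,
                   adapted=True, invalid=False, looping=False)
print(round(shaped_reward(step), 3))
```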

RL Training Setup

Train policy (tabular Q-learning)

cd multi-agent-strategy-openenv
python train_rl.py
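For readers new to tabular Q-learning, the core of such a trainer is an epsilon-greedy action choice plus a one-step update. The sketch below uses the seven action names from this README and the documented default alpha and gamma; the string state keys and helper names are hypothetical, and train_rl.py may encode states and structure its loop differently.

```python
import random
from collections import defaultdict

ACTIONS = ["harvest", "attack", "fortify", "scout", "adapt", "bluff", "noop"]


def make_q_table():
    """Q-table mapping a state key to one value per action, all starting at 0."""
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else take the best action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = q[state]
    return max(values, key=values.get)


def q_update(q, state, action, reward, next_state, done, alpha=0.2, gamma=0.95):
    """One-step Q-learning update; returns the new Q(state, action)."""
    # No bootstrap from the next state once the episode has ended.
    target = reward if done else reward + gamma * max(q[next_state].values())
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


q = make_q_table()
new_value = q_update(q, "s0", "harvest", 1.0, "s1", done=False)
print(new_value, choose_action(q, "s0", epsilon=0.0, rng=random.Random(0)))
```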

Optional env vars:

  • TRAIN_EPISODES (default 4000)
  • TRAIN_ALPHA (default 0.2)
  • TRAIN_GAMMA (default 0.95)
  • TRAIN_EPS_START (default 1.0)
  • TRAIN_EPS_END (default 0.05)
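A script can pick up these variables with `os.environ` and fall back to the documented defaults. The helper names and the linear epsilon schedule below are assumptions; train_rl.py may parse the variables differently or decay epsilon on another curve.

```python
import os


def load_train_config(env=None) -> dict:
    """Read the TRAIN_* variables, using the documented defaults when unset."""
    env = os.environ if env is None else env
    return {
        "episodes": int(env.get("TRAIN_EPISODES", "4000")),
        "alpha": float(env.get("TRAIN_ALPHA", "0.2")),
        "gamma": float(env.get("TRAIN_GAMMA", "0.95")),
        "eps_start": float(env.get("TRAIN_EPS_START", "1.0")),
        "eps_end": float(env.get("TRAIN_EPS_END", "0.05")),
    }


def epsilon_at(episode: int, cfg: dict) -> float:
    """Linear decay from eps_start to eps_end (the linear form is an assumption)."""
    if cfg["episodes"] <= 1:
        return cfg["eps_end"]
    frac = episode / (cfg["episodes"] - 1)
    return cfg["eps_start"] + frac * (cfg["eps_end"] - cfg["eps_start"])


cfg = load_train_config({})  # empty mapping: every value falls back to its default
print(cfg["episodes"], epsilon_at(0, cfg))
```

Accepting an optional mapping instead of always reading `os.environ` makes the loader easy to exercise without touching the real environment.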

Recommended settings for stronger performance on the medium task (PowerShell syntax):

$env:TRAIN_EPISODES="6000"
$env:TRAIN_ALPHA="0.16"
$env:TRAIN_GAMMA="0.98"
$env:TRAIN_EPS_END="0.02"
python train_rl.py

Artifacts:

  • artifacts/q_policy.json
  • artifacts/training_history.json

Evaluate trained policy

python evaluate_policy.py

Output:

  • per-task average score/margin
  • artifacts/evaluation_report.json

Inference script

python inference.py

inference.py calls evaluate_policy.py, giving automated evaluation of a submission a single entry point.

Frontend Playground

An interactive UI is included to manually control episodes and inspect reward/grader dynamics in real time.

After starting the server, open:

  • http://127.0.0.1:8000/playground

Frontend capabilities:

  • start/reset episodes by task and seed
  • manual action control + optional action payload
  • auto-step mode using a rule-aware policy
  • live telemetry cards for resources, defense, intel, and rule hints
  • live event feed and step timeline
  • grader breakdown bars and final score view

Windows Setup

python -m pip install -r requirements.txt

If openenv is not on PATH, call it by its full path (adjust Python313 to match your installed Python version):

& "$env:APPDATA\Python\Python313\Scripts\openenv.exe" --help

Validate and Serve

& "$env:APPDATA\Python\Python313\Scripts\openenv.exe" validate --verbose
python -m uvicorn strategy_env.server.app:app --host 0.0.0.0 --port 8000 --reload

Docker

docker build -t multi-agent-strategy-openenv:latest -f server/Dockerfile .
docker run --rm -p 8000:8000 multi-agent-strategy-openenv:latest

Hugging Face Spaces Deploy

openenv push --repo-id <your-username>/multi-agent-strategy-openenv

Project Layout

multi-agent-strategy-openenv/
|- server/
|  |- app.py
|  |- Dockerfile
|- strategy_env/
|  |- frontend/
|  |  |- index.html
|  |  |- assets/
|  |     |- styles.css
|  |     |- app.js
|  |- models.py
|  |- client.py
|  |- tasks.py
|  |- graders.py
|  |- server/
|     |- app.py
|     |- environment.py
|- artifacts/
|- train_rl.py
|- evaluate_policy.py
|- inference.py
|- openenv.yaml
|- pyproject.toml
|- requirements.txt
|- uv.lock
|- README.md