metadata
title: ACDE OpenEnv
emoji: 🚑
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
base_path: /web

Emergency Routing Simulation (ACDE)

This project is a simulation environment for emergency ambulance routing.

In simple terms:

  • A patient needs urgent care.
  • Several hospitals are available.
  • Each hospital has trade-offs (distance, traffic, ICU certainty, specialization).
  • Conditions can change while the ambulance is moving.
  • The agent must decide where to go, step by step.

The goal is not to be perfect every time. The goal is to make realistic decisions under uncertainty.

What This Project Does

This environment helps you test decision logic in situations where information is incomplete and time is limited.

It supports three difficulty levels:

  • acde_easy
  • acde_medium
  • acde_hard

As difficulty increases, uncertainty and penalties increase too.

How It Works (Simple Flow)

Every episode follows this loop:

  1. The environment is reset with a seed and task.
  2. You get an observation:
    • Patient condition
    • Required specialization
    • Hospital list with visible signals
  3. The policy scores hospitals.
  4. One hospital is selected.
  5. The environment validates arrival using hidden checks.
  6. You receive the outcome and a reward.
  7. If the episode is not done, repeat until success or failure.
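
The loop above can be sketched in a few lines of Python. This is a minimal, self-contained illustration and not the project's code: ToyEnv, its made-up hospital data, the 50% acceptance chance, the reward values, and pick_hospital are all hypothetical stand-ins. Only the reset/step shape and the outcome labels follow this README.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for the real environment (not the project's API)."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps

    def reset(self, seed, task):
        # Seeding makes an episode reproducible for a given seed and task.
        self.rng = random.Random(seed)
        self.task = task
        self.step_no = 1
        return self._observe()

    def _observe(self):
        hospitals = [
            {"hospital_id": f"H{i}", "distance_km": self.rng.uniform(1, 15)}
            for i in range(1, 4)
        ]
        return {"step": self.step_no, "hospitals": hospitals}

    def step(self, action):
        # Hidden check: arrival can fail even when visible signals looked fine.
        accepted = self.rng.random() < 0.5  # assumed probability, for illustration
        self.step_no += 1
        done = accepted or self.step_no > self.max_steps
        reward = 1.0 if accepted else -0.2  # assumed reward values
        outcome = "ACCEPTED" if accepted else "REJECTED"
        return self._observe(), reward, done, outcome


def pick_hospital(obs):
    # Placeholder policy: choose the nearest hospital.
    return min(obs["hospitals"], key=lambda h: h["distance_km"])["hospital_id"]


def run_episode(env, seed, task):
    obs = env.reset(seed, task)
    done, total, outcome, steps = False, 0.0, None, 0
    while not done:
        action = {"step": obs["step"], "hospital_id": pick_hospital(obs)}
        obs, reward, done, outcome = env.step(action)
        total += reward
        steps += 1
    return outcome, steps, total
```

Calling run_episode(ToyEnv(), seed=555, task="acde_medium") twice returns the same tuple, which is the property the seed is meant to provide.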

What Makes It Realistic

This is not a static lookup problem. It includes realistic uncertainty:

  • Displayed ICU status can differ from actual ICU status.
  • Traffic can change between steps.
  • Hospital overload can change outcomes.
  • Specialist availability can fail at arrival.
  • A hospital that failed once may become usable later.

The policy includes safety rules such as:

  • Immediate retry protection after rejection.
  • Cooldown handling for recently failed hospitals.
  • Exploration among top options (not blind random picks).
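
One way these three rules could fit together is sketched below. It is an illustration under assumptions, not the baseline policy itself: the function names, the cooldown counter, and the top_k parameter are hypothetical.

```python
import random


def choose_hospital(scores, last_rejected=None, cooldown=None, top_k=2, rng=None):
    """Pick a hospital id from {hospital_id: score}; higher scores are better."""
    if rng is None:
        rng = random.Random()
    cooldown = cooldown or {}

    # Rule 1: never retry the hospital that just rejected us.
    # Rule 2: skip hospitals that are still cooling down after a failure.
    eligible = {
        h: s
        for h, s in scores.items()
        if h != last_rejected and cooldown.get(h, 0) <= 0
    }
    # Fallback: relax the cooldown first, then the retry rule, so that a
    # choice is always possible.
    if not eligible:
        eligible = {h: s for h, s in scores.items() if h != last_rejected}
    if not eligible:
        eligible = dict(scores)

    # Rule 3: explore only among the best few options, not all of them.
    ranked = sorted(eligible, key=eligible.get, reverse=True)
    return rng.choice(ranked[:top_k])


def tick_cooldown(cooldown):
    """Decrease every cooldown by one step and drop the ones that expired."""
    return {h: n - 1 for h, n in cooldown.items() if n > 1}
```

Because cooldowns expire, a hospital that failed earlier becomes eligible again after a few steps, which matches the point above that a failed hospital may become usable later.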

Project Layout

Key files:

  • app/environment/core.py
    • Main environment loop (reset, step, transition logic)
  • app/environment/validation.py
    • Hidden validation checks (ICU, specialist, overload, outcome)
  • app/environment/graders.py
    • Final scoring and pass/fail grading
  • app/models/
    • Pydantic models for state, observation, reward, action
  • app/server/app.py
    • FastAPI server endpoints
  • inference.py
    • Local policy runner (CLI episodes)
  • data/learning_memory.json
    • Rolling policy memory
  • data/trajectory_history.jsonl
    • Per-step trajectory logs

API Endpoints

When server mode is running:

  • GET /health
  • POST /reset
  • POST /step
  • GET /state

Action Space

The agent sends one action per step as JSON:

{
  "step": 1,
  "hospital_id": "H3",
  "rationale": "short decision reason"
}

Action fields:

  • step (int, >=1): must match current environment step
  • hospital_id (str): target hospital identifier
  • rationale (str, optional): policy explanation
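
The field rules above can be checked before an action is sent. The project defines its real schema with Pydantic in app/models/action.py; the sketch below uses a standard-library dataclass instead so it runs without extra dependencies. ActionSketch is a hypothetical name, and the non-empty check on hospital_id is an added assumption.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ActionSketch:
    """Illustrative mirror of the action fields, not the project's model."""

    step: int
    hospital_id: str
    rationale: Optional[str] = None

    def __post_init__(self):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.step, bool) or not isinstance(self.step, int):
            raise ValueError("step must be an int")
        if self.step < 1:
            raise ValueError("step must be >= 1")
        # Assumption: an empty hospital_id is never valid.
        if not isinstance(self.hospital_id, str) or not self.hospital_id:
            raise ValueError("hospital_id must be a non-empty string")

    def to_json(self):
        # Leave out the optional rationale when it was not provided.
        body = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(body)
```

For example, ActionSketch(step=1, hospital_id="H3").to_json() produces a body with only step and hospital_id, while ActionSketch(step=0, hospital_id="H3") raises ValueError.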

Observation Space

Each reset() and step() returns an observation with:

  • episode metadata: episode_id, seed, task_id, scenario_name, scenario_difficulty
  • patient state: patient_condition, required_specialization, remaining time fields
  • hospital list: hospital_id, distance_km, icu, specialization, traffic
  • routing history: visited/failed hospitals and failure reasons
  • hidden-state feedback: last_arrival_outcome summary (status/reason/suitability)
  • memory snapshot used by the baseline policy

Core schema is defined by Pydantic models in:

  • app/models/action.py
  • app/models/observation.py
  • app/models/state.py
  • app/models/reward.py
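
To show how the visible hospital fields could feed a score, here is a deliberately simple weighted sum. Everything specific in it is an assumption: the string values for icu and traffic, the weights, and the 20 km distance cap are invented for illustration and are not taken from the baseline policy or the observation models.

```python
# Assumed encodings for the visible signals; the real values may differ.
ICU_BONUS = {"available": 1.0, "limited": 0.5, "full": 0.0}
TRAFFIC_PENALTY = {"low": 0.0, "medium": 0.5, "high": 1.0}


def score_hospital(hospital, required_specialization, max_distance_km=20.0):
    """Return a score where higher means a more attractive destination."""
    # Closer hospitals score higher; anything past the cap scores zero.
    closeness = max(0.0, 1.0 - hospital["distance_km"] / max_distance_km)
    # Unknown ICU or traffic values get a cautious middle value.
    icu = ICU_BONUS.get(hospital["icu"], 0.25)
    traffic = TRAFFIC_PENALTY.get(hospital["traffic"], 0.5)
    match = 1.0 if hospital["specialization"] == required_specialization else 0.0
    # Hypothetical weights: closeness and specialization dominate.
    return 0.35 * closeness + 0.25 * icu + 0.30 * match - 0.10 * traffic


def rank_hospitals(hospitals, required_specialization):
    """Sort hospitals from best to worst score."""
    return sorted(
        hospitals,
        key=lambda h: score_hospital(h, required_specialization),
        reverse=True,
    )
```

Since the displayed ICU status can differ from the actual one, a score like this is only a prior. The arrival outcome is what confirms or refutes it.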

Required Environment Variables

Before running inference.py, define:

  • API_BASE_URL: API base URL for the OpenAI-compatible endpoint
  • MODEL_NAME: model name used for rationale generation
  • HF_TOKEN: API key/token

Windows PowerShell example:

$env:API_BASE_URL = "https://api-inference.huggingface.co/v1"
$env:MODEL_NAME = "your-model-id"
$env:HF_TOKEN = "your-token"
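
A quick pre-flight check can catch a missing variable before an episode starts. The helper names below are hypothetical and are not part of inference.py; only the three variable names come from this README.

```python
import os

REQUIRED_VARS = ("API_BASE_URL", "MODEL_NAME", "HF_TOKEN")


def missing_env_vars(environ=None):
    """Return the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


def require_env(environ=None):
    """Raise a clear error that lists every missing variable at once."""
    missing = missing_env_vars(environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling require_env() at the top of a script fails fast with one message instead of a later, less obvious API error.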

Installation

1) Prerequisites

  • Python 3.10+ (3.12 works)
  • pip

2) Open a terminal in this folder

Folder should be:

  • my_env

3) Create and activate a virtual environment (recommended)

Windows PowerShell:

python -m venv .venv
.\.venv\Scripts\Activate.ps1

macOS/Linux:

python -m venv .venv
source .venv/bin/activate

4) Install dependencies

pip install -e .

If an editable install is not needed:

pip install .

Running the Project

Option A: Run policy episodes directly (most common)

Run one medium episode:

python inference.py --mode single --task acde_medium --episodes 1 --seed 555

Run 10 hard episodes:

python inference.py --mode single --task acde_hard --episodes 10 --seed 555

Run all levels in sequence:

python inference.py --mode full --episodes 3 --seed 555

If you run without --task, the script prompts for the level interactively.

Option B: Run as HTTP service

Start API server:

uvicorn app.server.app:app --host 0.0.0.0 --port 7860

Health check:

curl http://127.0.0.1:7860/health

Understanding Output

During inference.py runs, you will see:

  • Scenario details
  • Hospital options and scores
  • Decision strategy text
  • Outcome per step (ACCEPTED, PARTIAL, REJECTED)
  • Final episode summary
  • Batch summary (success rate, average score, average steps)

Example summary:

Batch summary:
  Success rate: 20.0%
  Average score: 0.39
  Average steps: 3.6
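
The three summary figures are plain averages over the episodes in a batch. The sketch below shows the arithmetic with hypothetical per-episode records; the keys success, score, and steps are assumed names, not the script's actual data structure.

```python
def summarize(episodes):
    """Compute batch statistics from a list of per-episode dicts."""
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")
    return {
        "success_rate": 100.0 * sum(1 for e in episodes if e["success"]) / n,
        "average_score": sum(e["score"] for e in episodes) / n,
        "average_steps": sum(e["steps"] for e in episodes) / n,
    }


def format_summary(summary):
    """Render the statistics in the same layout as the example summary."""
    return "\n".join(
        [
            "Batch summary:",
            f"  Success rate: {summary['success_rate']:.1f}%",
            f"  Average score: {summary['average_score']:.2f}",
            f"  Average steps: {summary['average_steps']:.1f}",
        ]
    )


# Made-up batch of five episodes with one success.
example = [
    {"success": True, "score": 0.90, "steps": 2},
    {"success": False, "score": 0.30, "steps": 4},
    {"success": False, "score": 0.25, "steps": 4},
    {"success": False, "score": 0.25, "steps": 4},
    {"success": False, "score": 0.25, "steps": 4},
]
```

Passing example through summarize and format_summary gives one success in five episodes, a score total of 1.95, and 18 steps in total, so each figure is that total divided by five.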

Data Files

The simulation writes data to data/:

  • learning_memory.json
    • Long-term policy memory
  • trajectory_history.jsonl
    • One JSON object per step
  • learning_archive.json
    • Aggregate run history and profiles

For a clean baseline run, back up and clear these files first.
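
Backing up and clearing can be done in one step by moving the files into a timestamped folder. This sketch assumes the simulation recreates any missing file on its next run, which this README does not confirm. If it does not, write empty files back after the move. backup_and_clear is a hypothetical helper, not part of the project.

```python
import shutil
import time
from pathlib import Path

DATA_FILES = (
    "learning_memory.json",
    "trajectory_history.jsonl",
    "learning_archive.json",
)


def backup_and_clear(data_dir="data", stamp=None):
    """Move existing data files into data/backup-<stamp>/ and list them."""
    data_dir = Path(data_dir)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    backup_dir = data_dir / f"backup-{stamp}"
    moved = []
    for name in DATA_FILES:
        src = data_dir / name
        if src.exists():
            # Create the backup folder only when there is something to move.
            backup_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(backup_dir / name))
            moved.append(name)
    return backup_dir, moved
```

Run it from the project folder as backup_and_clear() before starting a new batch of episodes.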

Typical Targets (Guideline)

These are practical targets, not strict rules:

  • Easy: usually high success, often fewer steps
  • Medium: mixed outcomes with meaningful rerouting
  • Hard: lower success, more failures, more steps

If the hard success rate is too high, increase uncertainty or rejection pressure. If it is too low, relax one or two of the hard-only probabilities.

Troubleshooting

"NameError" or model field errors

Make sure model fields and observation fields still match after logic changes. If you added new state keys, add them to the observation models as well.

Script asks for seed/level unexpectedly

Pass flags explicitly:

python inference.py --mode single --task acde_hard --episodes 10 --seed 555

No module named app

Run commands from inside the my_env folder, and make sure the install succeeded:

pip install -e .

Uvicorn command not found

Install server deps in your active environment:

pip install uvicorn fastapi

Notes

  • This project is designed for iterative policy tuning.
  • Small changes in hard-mode probabilities can noticeably shift success rates.
  • Run at least 10 to 30 episodes before drawing conclusions about a behavior change.
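
To see why small batches are noisy, it helps to put an interval around an observed success rate. The sketch below uses the Wilson score interval, a standard choice for binomial proportions. It is a supplementary illustration and not something the project computes.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% interval (z=1.96) for a success rate of successes/n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half
```

With 2 successes in 10 episodes, the interval spans roughly 6% to 51%, so a 20% success rate from one 10-episode batch says little on its own. The same 20% rate over 30 episodes gives a noticeably narrower interval.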