---
title: ACDE OpenEnv
emoji: "🚑"
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
base_path: /web
---
# Emergency Routing Simulation (ACDE)
This project is a simulation environment for emergency ambulance routing.
In simple terms:
- A patient needs urgent care.
- Several hospitals are available.
- Each hospital has trade-offs (distance, traffic, ICU certainty, specialization).
- Conditions can change while the ambulance is moving.
- The agent must decide where to go, step by step.
The goal is not to be perfect every time. The goal is to make realistic decisions under uncertainty.
## What This Project Does
This environment helps you test decision logic in situations where information is incomplete and time is limited.
It supports three difficulty levels:
- `acde_easy`
- `acde_medium`
- `acde_hard`
As difficulty increases, uncertainty and penalties increase too.
## How It Works (Simple Flow)
Every episode follows this loop:
1. The environment is reset with a seed and task.
2. You get an observation:
   - Patient condition
   - Required specialization
   - Hospital list with visible signals
3. The policy scores hospitals.
4. One hospital is selected.
5. The environment validates arrival using hidden checks.
6. You receive outcome + reward.
7. If not done, repeat until success or failure.
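The loop above can be sketched in a few lines of Python. `ToyEnv`, `nearest_untried`, and `run_episode` are hypothetical stand-ins written for this illustration; they are not the classes in `app/environment/core.py`, and the real observation, reward, and validation logic is richer.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for the real environment (illustration only)."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps

    def reset(self, seed, task):
        rng = random.Random(seed)
        self.task = task
        self.steps_taken = 0
        self.failed = []
        # Hidden state: whether each hospital would actually accept the patient.
        self.accepts = {f"H{i}": rng.random() < 0.5 for i in range(1, 5)}
        return self._observation()

    def _observation(self):
        hospitals = [
            {"hospital_id": hid, "distance_km": 2.0 * n}
            for n, hid in enumerate(sorted(self.accepts), start=1)
        ]
        return {"step": self.steps_taken + 1, "hospitals": hospitals,
                "failed": list(self.failed)}

    def step(self, action):
        self.steps_taken += 1
        # Hidden check on arrival: the agent only learns the result now.
        accepted = self.accepts[action["hospital_id"]]
        if not accepted:
            self.failed.append(action["hospital_id"])
        done = accepted or self.steps_taken >= self.max_steps
        reward = 1.0 if accepted else -0.2
        return self._observation(), reward, done, accepted


def nearest_untried(observation):
    """Toy policy: pick the closest hospital that has not failed yet."""
    candidates = [h for h in observation["hospitals"]
                  if h["hospital_id"] not in observation["failed"]]
    candidates = candidates or observation["hospitals"]
    return min(candidates, key=lambda h: h["distance_km"])["hospital_id"]


def run_episode(env, seed, task):
    observation = env.reset(seed=seed, task=task)
    done, success, total_reward = False, False, 0.0
    while not done:
        action = {"step": observation["step"],
                  "hospital_id": nearest_untried(observation)}
        observation, reward, done, success = env.step(action)
        total_reward += reward
    return success, total_reward, env.steps_taken


success, total_reward, steps = run_episode(ToyEnv(), seed=555, task="acde_medium")
print(success, round(total_reward, 2), steps)
```

The point of the sketch is the shape of the loop: the policy only sees the observation, and acceptance is decided by state it cannot read directly.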
## What Makes It Realistic
This is not a static lookup problem. It includes realistic uncertainty:
- Displayed ICU status can differ from actual ICU status.
- Traffic can change between steps.
- Hospital overload can change outcomes.
- Specialist availability can fail at arrival.
- A hospital that failed once may become usable later.
The policy includes safety rules such as:
- Immediate retry protection after rejection.
- Cooldown handling for recently failed hospitals.
- Exploration among top options (not blind random picks).
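One way these three rules could fit together is sketched below. The function names, the `top_k` parameter, and the cooldown representation are assumptions made for illustration; the baseline policy in this project may implement them differently.

```python
import random


def choose_hospital(scores, last_rejected, cooldowns, rng, top_k=2):
    """Pick a hospital id from `scores` (higher is better).

    scores:        dict of hospital_id -> policy score
    last_rejected: hospital rejected on the previous step, or None
    cooldowns:     dict of hospital_id -> steps left before a retry is allowed
    """
    eligible = {
        hid: score for hid, score in scores.items()
        if hid != last_rejected and cooldowns.get(hid, 0) <= 0
    }
    if not eligible:
        # Everything is blocked: relax the cooldown first, and only allow an
        # immediate retry if there is no other hospital at all.
        eligible = {h: s for h, s in scores.items() if h != last_rejected} or dict(scores)
    ranked = sorted(eligible, key=eligible.get, reverse=True)
    # Explore only among the best few candidates, never the full list.
    return rng.choice(ranked[:top_k])


def tick_cooldowns(cooldowns):
    """Reduce every cooldown by one step, dropping entries that expire."""
    return {hid: left - 1 for hid, left in cooldowns.items() if left > 1}


scores = {"H1": 0.9, "H2": 0.7, "H3": 0.6, "H4": 0.2}
pick = choose_hospital(scores, last_rejected="H1", cooldowns={"H2": 2},
                       rng=random.Random(0))
print(pick)
```

Here `H1` is excluded because it was just rejected and `H2` is still cooling down, so the choice is made between the remaining candidates.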
## Project Layout
Key files:
- `app/environment/core.py`
  - Main environment loop (`reset`, `step`, transition logic)
- `app/environment/validation.py`
  - Hidden validation checks (ICU, specialist, overload, outcome)
- `app/environment/graders.py`
  - Final scoring and pass/fail grading
- `app/models/`
  - Pydantic models for state, observation, reward, action
- `app/server/app.py`
  - FastAPI server endpoints
- `inference.py`
  - Local policy runner (CLI episodes)
- `data/learning_memory.json`
  - Rolling policy memory
- `data/trajectory_history.jsonl`
  - Per-step trajectory logs
## API Endpoints
When server mode is running:
- `GET /health`
- `POST /reset`
- `POST /step`
- `GET /state`
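A minimal client sketch using only the standard library is shown below. The base URL matches the `uvicorn` example port used in this README. The request body for `/reset` (`seed`, `task_id`) is an assumption, as is the idea that every endpoint returns JSON; check `app/server/app.py` for the actual schemas.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # assumed local server address


def build_request(method, path, payload=None):
    """Build a JSON request for one of the endpoints listed above."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def call(request, timeout=10):
    """Send the request and decode the response body as JSON (assumed format)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical payloads: the /reset field names are assumptions.
reset_request = build_request("POST", "/reset", {"seed": 555, "task_id": "acde_medium"})
step_request = build_request("POST", "/step", {"step": 1, "hospital_id": "H3"})
health_request = build_request("GET", "/health")

# With the server running, each request can be sent with call(...), for example:
# print(call(health_request))
```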
## Action Space
The agent sends one action per step as JSON:
```json
{
"step": 1,
"hospital_id": "H3",
"rationale": "short decision reason"
}
```
Action fields:
- `step` (int, >=1): must match current environment step
- `hospital_id` (str): target hospital identifier
- `rationale` (str, optional): policy explanation
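The field rules above can be expressed as a small validated type. This sketch uses a standard-library dataclass as a stand-in; the project's real model is a Pydantic class in `app/models/action.py`, and the non-empty check on `hospital_id` and the `check_step` helper are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    step: int
    hospital_id: str
    rationale: Optional[str] = None

    def __post_init__(self):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.step, bool) or not isinstance(self.step, int) or self.step < 1:
            raise ValueError("step must be an integer >= 1")
        if not isinstance(self.hospital_id, str) or not self.hospital_id:
            raise ValueError("hospital_id must be a non-empty string")


def check_step(action, current_step):
    """Raise if the action was built for a different environment step."""
    if action.step != current_step:
        raise ValueError(
            f"action.step={action.step} does not match current step {current_step}"
        )


action = Action(step=1, hospital_id="H3", rationale="short decision reason")
check_step(action, current_step=1)
```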
## Observation Space
Each `reset()` and `step()` returns an observation with:
- episode metadata: `episode_id`, `seed`, `task_id`, `scenario_name`, `scenario_difficulty`
- patient state: `patient_condition`, `required_specialization`, remaining time fields
- hospital list: `hospital_id`, `distance_km`, `icu`, `specialization`, `traffic`
- routing history: visited/failed hospitals and failure reasons
- hidden-state feedback: `last_arrival_outcome` summary (status/reason/suitability)
- memory snapshot used by the baseline policy
Core schema is defined by Pydantic models in:
- `app/models/action.py`
- `app/models/observation.py`
- `app/models/state.py`
- `app/models/reward.py`
## Required Environment Variables
Before running `inference.py`, define:
- `API_BASE_URL`: API base URL for the OpenAI-compatible endpoint
- `MODEL_NAME`: model name used for rationale generation
- `HF_TOKEN`: API key/token
Windows PowerShell example:
```powershell
$env:API_BASE_URL = "https://api-inference.huggingface.co/v1"
$env:MODEL_NAME = "your-model-id"
$env:HF_TOKEN = "your-token"
```
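A quick way to fail early when one of these variables is missing is a check like the following. `load_settings` is a hypothetical helper for illustration, not a function from `inference.py`.

```python
import os

REQUIRED_VARS = ("API_BASE_URL", "MODEL_NAME", "HF_TOKEN")


def load_settings(environ=None):
    """Return the required variables, or raise listing the ones that are unset."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


# Example with an explicit mapping instead of the real process environment.
settings = load_settings({
    "API_BASE_URL": "https://api-inference.huggingface.co/v1",
    "MODEL_NAME": "your-model-id",
    "HF_TOKEN": "your-token",
})
print(sorted(settings))
```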
## Installation
### 1) Prerequisites
- Python 3.10+ (3.12 works)
- `pip`
### 2) Open a terminal in this folder
The folder should be:
- `my_env`
### 3) Create and activate a virtual environment (recommended)
Windows PowerShell:
```powershell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
```
macOS/Linux:
```bash
python -m venv .venv
source .venv/bin/activate
```
### 4) Install dependencies
```bash
pip install -e .
```
If you do not need an editable install:
```bash
pip install .
```
## Running the Project
### Option A: Run policy episodes directly (most common)
Run one medium episode:
```bash
python inference.py --mode single --task acde_medium --episodes 1 --seed 555
```
Run 10 hard episodes:
```bash
python inference.py --mode single --task acde_hard --episodes 10 --seed 555
```
Run all levels in sequence:
```bash
python inference.py --mode full --episodes 3 --seed 555
```
If you run without `--task`, the script prompts for the difficulty level interactively.
### Option B: Run as HTTP service
Start the API server:
```bash
uvicorn app.server.app:app --host 0.0.0.0 --port 7860
```
Health check:
```bash
curl http://127.0.0.1:7860/health
```
## Understanding Output
During `inference.py` runs, you will see:
- Scenario details
- Hospital options and scores
- Decision strategy text
- Outcome per step (`ACCEPTED`, `PARTIAL`, `REJECTED`)
- Final episode summary
- Batch summary (success rate, average score, average steps)
Example summary:
```text
Batch summary:
Success rate: 20.0%
Average score: 0.39
Average steps: 3.6
```
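The three batch figures are simple averages over episodes. The sketch below shows the arithmetic; the per-episode keys `success`, `score`, and `steps` are assumed names for illustration, not the structure `inference.py` uses internally.

```python
def summarize(episodes):
    """Compute success rate (%), average score, and average steps."""
    if not episodes:
        raise ValueError("at least one episode is required")
    count = len(episodes)
    return {
        "success_rate": 100.0 * sum(1 for e in episodes if e["success"]) / count,
        "average_score": sum(e["score"] for e in episodes) / count,
        "average_steps": sum(e["steps"] for e in episodes) / count,
    }


def format_summary(summary):
    return "\n".join([
        "Batch summary:",
        f"Success rate: {summary['success_rate']:.1f}%",
        f"Average score: {summary['average_score']:.2f}",
        f"Average steps: {summary['average_steps']:.1f}",
    ])


# Made-up episode results, only to exercise the functions.
episodes = [
    {"success": True, "score": 0.8, "steps": 2},
    {"success": False, "score": 0.2, "steps": 5},
]
summary = summarize(episodes)
print(format_summary(summary))
```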
## Data Files
The simulation writes data to `data/`:
- `learning_memory.json`
  - Long-term policy memory
- `trajectory_history.jsonl`
  - One JSON object per step
- `learning_archive.json`
  - Aggregate run history and profiles
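Because `trajectory_history.jsonl` holds one JSON object per line, it can be loaded with a short reader like the one below. This is a sketch: `load_trajectory` is a hypothetical helper, and the fields inside each record are not assumed here.

```python
import json
from pathlib import Path


def load_trajectory(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open("r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}: invalid JSON on line {line_number}") from exc
    return records


# Usage, once the file exists:
# steps = load_trajectory("data/trajectory_history.jsonl")
# print(len(steps), "logged steps")
```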
If you want a clean baseline run, back up these files and then clear them.
## Typical Targets (Guideline)
These are practical targets, not strict rules:
- Easy: usually high success, often fewer steps
- Medium: mixed outcomes with meaningful rerouting
- Hard: lower success, more failures, more steps
If hard success is too high, increase uncertainty or rejection pressure.
If hard success is too low, ease one or two hard-only probabilities.
## Troubleshooting
### "NameError" or model field errors
Make sure the model fields and observation fields still match after logic changes.
If you added new state keys, add them to the observation models as well.
### Script asks for seed/level unexpectedly
Pass flags explicitly:
```bash
python inference.py --mode single --task acde_hard --episodes 10 --seed 555
```
### No module named app
Run commands from inside the `my_env` folder, and make sure the install succeeded:
```bash
pip install -e .
```
### Uvicorn command not found
Install the server dependencies in your active environment:
```bash
pip install uvicorn fastapi
```
## Notes
- This project is designed for iterative policy tuning.
- Small changes in hard-mode probabilities can noticeably shift success rates.
- Always test with at least 10-30 episodes before concluding behavior changes.
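To see why small batches are noisy, it helps to estimate the standard error of a measured success rate. The sketch below uses the usual binomial approximation and treats episodes as independent, which is an assumption (shared policy memory across episodes can violate it).

```python
import math


def success_rate_standard_error(success_rate, episodes):
    """Binomial standard error of a success rate given as a fraction in [0, 1]."""
    if episodes <= 0:
        raise ValueError("episodes must be positive")
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be between 0 and 1")
    return math.sqrt(success_rate * (1.0 - success_rate) / episodes)


for n in (10, 30, 100):
    error = success_rate_standard_error(0.2, n)
    print(f"{n:>3} episodes at 20% success: standard error about {100 * error:.1f} points")
```

At a 20% success rate the standard error is roughly 13 percentage points with 10 episodes and about 7 with 30, so a shift of a few points between two short batches is not strong evidence of a real change.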