---
title: Sepsis OpenEnv
colorFrom: blue
colorTo: red
sdk: docker
app_port: 7860
tags:
  - openenv
  - healthcare
  - offline-rl
  - sepsis
---

# Sepsis OpenEnv

Sepsis OpenEnv is a real-world sequential sepsis management environment for the OpenEnv hackathon workflow. It exposes a standard `reset()` / `step()` / `state()` loop and evaluates how well an agent gathers information, chooses treatment, and manages a logged ICU trajectory under partial observability.

The environment is designed to satisfy the Round 1 requirements:

- real-world task: ICU sepsis workup and treatment decisions
- typed models for action, observation, and state
- 3 graded tasks: `easy`, `medium`, `hard`
- dense rewards with safety penalties and partial-progress signal
- reproducible root-level `inference.py`
- Dockerized server for local and Hugging Face deployment

## What The Environment Simulates

At each step, the agent can:

- request one lab from a clinically meaningful set
- request one treatment plan from a sepsis-management action set
- optionally mark the current state as suspected sepsis

The environment advances along a logged patient trajectory and rewards the agent for:

- detecting likely sepsis early
- requesting informative labs instead of repeatedly querying low-value tests
- selecting treatment plans that fit the hidden severity pattern in the logged stay
- avoiding obviously unsafe escalation or under-treatment

This is an offline environment built from a compact processed bundle derived from the MIMIC-III demo cohort. It is inspired by the WD3QNE sepsis-treatment paper, but the environment is purpose-built for OpenEnv evaluation rather than paper reproduction.

## Tasks

Task definitions live in `tasks.py`.

- `easy`: early sepsis workup from partial bedside data with an emphasis on timely lab selection
- `medium`: diagnosis plus early treatment initiation after iterative lab requests
- `hard`: full sepsis management across longer unstable trajectories with stabilization and outcome pressure

Each task has a deterministic grader in `graders.py` that returns a score in `[0.0, 1.0]`.

## Action Space

Defined in `models.py`.

- `action_type`: `request_lab`, `request_treatment`, or `monitor`
- `suspect_sepsis`: boolean detection signal
- `lab_type`: one of `lactate`, `wbc`, `creatinine`, `bicarbonate`, `platelets`, `bilirubin`
- `treatment_type`: one of `monitor`, `fluids`, `vasopressors`, `combination`

## Observation Space

Defined in `models.py`.

Each observation contains:

- task id and task description
- current patient trajectory id
- current step and max steps
- severity proxy
- mortality flag from the logged stay
- demographics and always-visible vitals
- visible non-lab context features
- only the labs explicitly requested so far
- current cumulative reward and last reward

Hidden logged treatment choices and unrevealed labs are intentionally not exposed in observations.

## Reward Design

The reward function is dense, not purely terminal.

Per step:

- positive signal for early sepsis suspicion on high-risk states
- reward for requesting priority labs that fit the current presentation
- reward for selecting treatment plans that match the hidden severity pattern
- progress bonus when the next logged state becomes less severe
- novelty bonus for new state-action exploration
- penalties for duplicate labs, repeated low-value actions, unsafe escalation, or obvious under-treatment

At the end of the episode:

- bonus for survival trajectories
- penalty for death trajectories

## Core Files

- `openenv.yaml`: OpenEnv metadata
- `models.py`: typed action / observation / state models
- `tasks.py`: task catalog
- `graders.py`: deterministic graders
- `client.py`: client wrapper
- `server/app.py`: FastAPI app and server entrypoint
- `server/sepsis_environment.py`: environment implementation
- `inference.py`: baseline runner
- `validate_local.py`: local smoke checks
- `prepare_submission.py`: creates a clean submission bundle

## Setup

Create a virtual environment and install dependencies:

```bash
python -m venv .venv
.venv\Scripts\python.exe -m pip install --upgrade pip
.venv\Scripts\python.exe -m pip install -r requirements.txt
```

Run local validation:

```bash
.venv\Scripts\python.exe validate_local.py
```

Run the official OpenEnv validator:

```bash
.venv\Scripts\openenv.exe validate
```

Start the environment server locally:

```bash
.venv\Scripts\python.exe -m uvicorn server.app:app --host 0.0.0.0 --port 7860
```

Quick checks:

```bash
curl http://127.0.0.1:7860/health
curl http://127.0.0.1:7860/metadata
```

## Baseline Inference

The required root-level baseline script is `inference.py`.

Run locally:

```bash
.venv\Scripts\python.exe inference.py
```

The script:

- writes reproducible scores to `outputs/baseline_scores.json`
- emits OpenEnv-style `[START]`, `[STEP]`, and `[END]` lines to stdout
- uses the OpenAI client if `API_BASE_URL`, `MODEL_NAME`, and `HF_TOKEN` are set
- otherwise falls back to a deterministic staged baseline policy

Current deterministic baseline scores from the local run:

- `easy`: `1.0`
- `medium`: `1.0`
- `hard`: `0.96`
- mean score: `0.9867`

## Docker

Build:

```bash
docker build -t sepsis-openenv .
```

Run:

```bash
docker run -p 7860:7860 sepsis-openenv
```

The container exposes a working `/health` endpoint and responds to `/reset`.

## Submission Bundle

To prepare a clean hackathon-ready bundle:

```bash
.venv\Scripts\python.exe prepare_submission.py
```

This creates `submission_bundle/` with only the files needed for the environment runtime and submission packaging.

## Runtime Assets

The runtime uses the preprocessed assets in:

- `env_data/processed_demo_dataset.pkl`
- `env_data/selected_features.json`

This keeps the environment lightweight enough for the hackathon resource limits.

## Validation Status

The following checks have been run locally:

- `python validate_local.py`: passed
- `python inference.py`: passed
- `openenv validate`: passed
- `docker build -t sepsis-openenv .`: passed
- `docker run -p 7860:7860 sepsis-openenv`: passed
- `/health` and `/metadata`: passed

## Inspiration

Wu, X., Li, R., He, Z. et al. *A value-based deep reinforcement learning model with human expertise in optimal treatment of sepsis.* npj Digital Medicine 6, 15 (2023). https://doi.org/10.1038/s41746-023-00755-5