title: PRANA-Env Environment Server
emoji: 🏥
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
  - openenv
  - reinforcement-learning
  - clinical

PRANA-Env

Policy Reinforced Administrative Navigation Agent: an OpenEnv RL environment for kidney transplant administration.

PRANA-Env simulates the multi-step clinical workflow required to file a KARS-compliant SRTR report for a transplant candidate. The agent must query fragmented datastores, detect stale lab values, and file a complete report, earning rewards from a deterministic KARS validator.

Architecture

LLM Agent (GPT-4o / fine-tuned model)
        │
        │  query_db / record_value / file_report
        ▼
  PranaEnv Client  ──(WebSocket)──  PranaEnvironment Server
                                          β”‚
                                    KARS Validator
                                    (reward signal)

Action Space

| Action | Required fields | Effect |
| --- | --- | --- |
| query_db | target, field, patient_id | Returns current value from PatientDB |
| record_value | field, value | Writes value into episode record with today's timestamp |
| file_report | (none) | KARS validates record → reward → done |
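
The required fields listed above can be checked on the client side before an action is sent. This is a minimal sketch that assumes actions are held as plain dicts keyed by `action_type`; `REQUIRED_FIELDS` and `missing_action_fields` are hypothetical names, not part of prana_env, and the real `PranaAction` model may validate differently.

```python
# Hypothetical helper, not part of prana_env. Mirrors the Action Space table.
REQUIRED_FIELDS = {
    "query_db": ("target", "field", "patient_id"),
    "record_value": ("field", "value"),
    "file_report": (),
}


def missing_action_fields(action: dict) -> list[str]:
    """Return the required fields that are absent (or None) in an action dict."""
    action_type = action.get("action_type")
    if action_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown action_type: {action_type!r}")
    return [name for name in REQUIRED_FIELDS[action_type] if action.get(name) is None]
```

An empty list means the action carries everything the table asks for; for example, a `query_db` action without `patient_id` would return `["patient_id"]`.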

Observation Space

Every observation includes:

PranaObservation(
    query_result      # str: value, NOT_FOUND, RECORDED, KARS status
    active_task       # str: current task context (t1–t5)
    recorded_fields   # dict: {field: {value, recorded_at}}, the full current record
    missing_fields    # list[str]: KARS issues after file_report
    kars_result       # str | None: "PASSED" | "FAILED"
    reward            # float
    done              # bool
)

recorded_fields shows the agent its full current state, including timestamps, which enables staleness detection and selective re-querying.

Reward Signal

| Event | Reward |
| --- | --- |
| KARS PASSED: first attempt | +15 |
| KARS PASSED: after correction | +10 |
| Re-query of already-fresh field | −1 |
| KARS FAILED: missing or stale fields | −5 |
| KARS FAILED: unrecoverable (3 attempts) | −10 |
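
The filing rewards can be read as a function of the KARS outcome and the attempt number. The sketch below is illustrative only: `kars_reward` is a hypothetical name, and it assumes that the −10 on the third failed attempt replaces the −5 rather than adding to it, which the table does not specify.

```python
REQUERY_PENALTY = -1  # applied per re-query of an already-fresh field


def kars_reward(passed: bool, attempt: int, max_attempts: int = 3) -> int:
    """Reward for one file_report call; `attempt` is 1-based."""
    if passed:
        # Full reward only when the first filing passes.
        return 15 if attempt == 1 else 10
    # Assumption: a failure on the last allowed attempt is the "unrecoverable" case.
    return -10 if attempt >= max_attempts else -5
```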

Temporal Model (T1 β†’ T5)

Episodes simulate a 4-month clinical timeline:

  • T1 (2025-11-07): Initial labs recorded. Snapshot pre-loaded into episode record on reset().
  • T5 (2026-03-07): Filing date. KARS requires time-sensitive fields within 90 days.

On reset(), the agent sees a pre-populated record with stale T1 values. It must:

  1. Identify which fields are stale (the time-sensitive ones: hba1c, gfr, creatinine)
  2. Re-query only those fields to get current T5 values
  3. Leave stable fields (blood_type) untouched; re-querying them incurs a penalty
  4. File when the record is complete and fresh
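
The steps above amount to a date comparison over the pre-loaded record. A minimal sketch, assuming `recorded_at` is an ISO date string and that a value exactly 90 days old still counts as fresh; `fields_to_requery` is a hypothetical helper, and the `blood_type` value is made up for illustration.

```python
from datetime import date

TIME_SENSITIVE = {"hba1c", "gfr", "creatinine"}  # fields subject to the 90-day window
WINDOW_DAYS = 90


def fields_to_requery(recorded_fields: dict, filing_date: date) -> list[str]:
    """Return the time-sensitive fields whose recorded_at falls outside the window."""
    stale = []
    for field, entry in recorded_fields.items():
        if field not in TIME_SENSITIVE:
            continue  # stable fields such as blood_type are left untouched
        recorded_at = date.fromisoformat(entry["recorded_at"])
        if (filing_date - recorded_at).days > WINDOW_DAYS:
            stale.append(field)
    return sorted(stale)


record = {
    "hba1c": {"value": 7.2, "recorded_at": "2025-11-07"},
    "gfr": {"value": 18.5, "recorded_at": "2025-11-07"},
    "blood_type": {"value": "O+", "recorded_at": "2025-11-07"},  # illustrative value
}
print(fields_to_requery(record, date(2026, 3, 7)))  # ['gfr', 'hba1c']
```

T1 lies 120 days before T5, so every time-sensitive field in the T1 snapshot is outside the window, while `blood_type` is never selected.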

Example trajectory:

reset() → record pre-loaded: {hba1c: {value: 7.2, recorded_at: 2025-11-07}, ...}

query_db(hba1c)      → 8.9   (T5 value; was 7.2 at T1)
query_db(gfr)        → 12.1  (was 18.5 at T1)
query_db(creatinine) → 4.7   (was 3.8 at T1)
record_value × 3
file_report()        → KARS PASSED, reward=+15

Quick Start

# Start the server
conda activate openenv
uvicorn server.app:app --host 0.0.0.0 --port 8000

# Run the LLM agent loop
python test_agent.py

# Run N episodes for a GRPO rollout batch (Python)
from test_agent import run_episodes

trajectories = run_episodes(
    task="File a KARS-compliant SRTR report for patient P001. "
         "A T1 record exists from 4 months ago. "
         "Check which fields are stale, re-query only what's needed, and file.",
    patient_id="P001",
    n=8,  # GRPO batch size
)
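
For GRPO, the rewards of a rollout batch are typically normalized within the group before the policy update. The structure of `trajectories` is not documented here, so the sketch below works on a plain list of per-episode total rewards; `group_relative_advantages` is a hypothetical helper, and the use of the population standard deviation is an assumption (implementations differ).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each episode reward against its group's mean and standard deviation."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every episode earned the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Episodes that beat the group average get a positive advantage and the rest a negative one; a group with identical rewards yields all zeros and therefore no learning signal.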

Patients

| ID | Condition | T1 GFR | T5 GFR | HbA1c T1→T5 | Notes |
| --- | --- | --- | --- | --- | --- |
| P001 | CKD Stage 4 | 18.5 | 12.1 | 7.2→8.9 | Complete record |
| P002 | Diabetic nephropathy | 11.0 | 8.3 | 9.1→10.2 | Antihypertensives, insulin |
| P003 | CKD Stage 3 | 22.3 | 19.8 | null | HbA1c never recorded, inactive waitlist |

KARS Required Fields

| Field | Source | Time-sensitive |
| --- | --- | --- |
| hba1c | PatientDB | Yes (90-day window) |
| gfr | PatientDB | Yes (90-day window) |
| creatinine | PatientDB | Yes (90-day window) |
| blood_type | PatientDB | No (stable) |
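
A deterministic check over these four fields might look like the sketch below. It is not the validator in `server/prana_env_environment.py`: `kars_issues` and the issue-string format are hypothetical, and `recorded_at` is assumed to be an ISO date string.

```python
from datetime import date

# Field name -> whether the 90-day window applies.
KARS_FIELDS = {"hba1c": True, "gfr": True, "creatinine": True, "blood_type": False}


def kars_issues(record: dict, filing_date: date, window_days: int = 90) -> list[str]:
    """List missing or stale required fields; an empty list means the record would pass."""
    issues = []
    for field, time_sensitive in KARS_FIELDS.items():
        entry = record.get(field)
        if entry is None or entry.get("value") is None:
            issues.append(f"{field}: missing")
            continue
        if time_sensitive:
            age = (filing_date - date.fromisoformat(entry["recorded_at"])).days
            if age > window_days:
                issues.append(f"{field}: stale ({age} days old)")
    return issues
```

Under these assumptions, a record such as P003's, where HbA1c was never recorded, would report `hba1c: missing` until the agent records a value.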

Project Structure

prana_env/
├── client.py                      # PranaEnv WebSocket client
├── models.py                      # PranaAction, PranaObservation
├── test_agent.py                  # LLM agent RL loop (GPT-4o)
├── test_client.py                 # Smoke test client
├── data/
│   └── patient_db.json            # Patient records with T1 snapshots and T5 values
└── server/
    ├── app.py                     # FastAPI + WebSocket server
    ├── prana_env_environment.py   # RL environment: actions, KARS validator, rewards
    └── Dockerfile

Connecting to an Existing Server

from prana_env.client import PranaEnv
from prana_env.models import PranaAction

with PranaEnv(base_url="http://localhost:8000") as env:
    result = env.reset(patient_id="P001")
    print(result.observation.query_result)

    result = env.step(PranaAction(action_type="query_db", target="PatientDB",
                                  field="hba1c", patient_id="P001"))
    print(result.observation.query_result)   # "8.9"
    print(result.observation.recorded_fields)  # current record state

Deploying to Hugging Face Spaces

openenv push
# or
openenv push --repo-id my-org/prana-env --private

After deployment:

  • Web UI: /web
  • API docs: /docs
  • Health: /health
  • WebSocket: /ws
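
The endpoint paths above can be joined to a deployment's base URL with the standard library. This is a small sketch; `endpoint_urls` is a hypothetical helper, and it assumes the WebSocket endpoint is reached with the `ws`/`wss` scheme on the same host.

```python
from urllib.parse import urlsplit, urlunsplit


def endpoint_urls(base_url: str) -> dict[str, str]:
    """Build the Web UI, API docs, health, and WebSocket URLs for a deployment."""
    parts = urlsplit(base_url.rstrip("/"))
    # Assumption: TLS deployments expose the socket over wss, plain HTTP over ws.
    ws_scheme = "wss" if parts.scheme == "https" else "ws"

    def build(scheme: str, path: str) -> str:
        return urlunsplit((scheme, parts.netloc, parts.path + path, "", ""))

    return {
        "web": build(parts.scheme, "/web"),
        "docs": build(parts.scheme, "/docs"),
        "health": build(parts.scheme, "/health"),
        "ws": build(ws_scheme, "/ws"),
    }
```

For a local server, `endpoint_urls("http://localhost:8000")["health"]` gives `http://localhost:8000/health`.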