prana_env

pbanavara committed on Mar 8

Commit 75a4eab (verified)

1 Parent(s): fb9e30c

Upload folder using huggingface_hub

Files changed (7)

README.md +124 -197
client.py +4 -14
data/patient_db.json +35 -11
models.py +25 -15
server/prana_env_environment.py +318 -34
test_agent.py +227 -0
test_client.py +12 -9

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
-title: Prana Env Environment Server
-emoji: 🏒
 colorFrom: purple
 colorTo: indigo
 sdk: docker
@@ -9,247 +9,174 @@ app_port: 8000
 base_path: /web
 tags:
   - openenv
 ---
-# Prana Env Environment
-A simple test environment that echoes back messages. Perfect for testing the env APIs as well as demonstrating environment usage patterns.
-## Quick Start
-The simplest way to use the Prana Env environment is through the `PranaEnv` class:
-```python
-from prana_env import PranaAction, PranaEnv
-try:
-    # Create environment from Docker image
-    prana_envenv = PranaEnv.from_docker_image("prana_env-env:latest")
-    # Reset
-    result = prana_envenv.reset()
-    print(f"Reset: {result.observation.echoed_message}")
-    # Send multiple messages
-    messages = ["Hello, World!", "Testing echo", "Final message"]
-    for msg in messages:
-        result = prana_envenv.step(PranaAction(message=msg))
-        print(f"Sent: '{msg}'")
-        print(f"  → Echoed: '{result.observation.echoed_message}'")
-        print(f"  → Length: {result.observation.message_length}")
-        print(f"  → Reward: {result.reward}")
-finally:
-    # Always clean up
-    prana_envenv.close()
 ```
-That's it! The `PranaEnv.from_docker_image()` method handles:
-- Starting the Docker container
-- Waiting for the server to be ready
-- Connecting to the environment
-- Container cleanup when you call `close()`
-## Building the Docker Image
-Before using the environment, you need to build the Docker image:
-```bash
-# From project root
-docker build -t prana_env-env:latest -f server/Dockerfile .
-```
-## Deploying to Hugging Face Spaces
-You can easily deploy your OpenEnv environment to Hugging Face Spaces using the `openenv push` command:
-```bash
-# From the environment directory (where openenv.yaml is located)
-openenv push
-# Or specify options
-openenv push --namespace my-org --private
 ```
-The `openenv push` command will:
-1. Validate that the directory is an OpenEnv environment (checks for `openenv.yaml`)
-2. Prepare a custom build for Hugging Face Docker space (enables web interface)
-3. Upload to Hugging Face (ensuring you're logged in)
-### Prerequisites
-- Authenticate with Hugging Face: The command will prompt for login if not already authenticated
-### Options
-- `--directory`, `-d`: Directory containing the OpenEnv environment (defaults to current directory)
-- `--repo-id`, `-r`: Repository ID in format 'username/repo-name' (defaults to 'username/env-name' from openenv.yaml)
-- `--base-image`, `-b`: Base Docker image to use (overrides Dockerfile FROM)
-- `--private`: Deploy the space as private (default: public)
-### Examples
 ```bash
-# Push to your personal namespace (defaults to username/env-name from openenv.yaml)
-openenv push
-# Push to a specific repository
-openenv push --repo-id my-org/my-env
-# Push with a custom base image
-openenv push --base-image ghcr.io/meta-pytorch/openenv-base:latest
-# Push as a private space
-openenv push --private
-# Combine options
-openenv push --repo-id my-org/my-env --base-image custom-base:latest --private
 ```
-After deployment, your space will be available at:
-`https://huggingface.co/spaces/<repo-id>`
-The deployed space includes:
-- **Web Interface** at `/web` - Interactive UI for exploring the environment
-- **API Documentation** at `/docs` - Full OpenAPI/Swagger interface
-- **Health Check** at `/health` - Container health monitoring
-- **WebSocket** at `/ws` - Persistent session endpoint for low-latency interactions
-## Environment Details
-### Action
-**PranaAction**: Contains a single field
-- `message` (str) - The message to echo back
-### Observation
-**PranaObservation**: Contains the echo response and metadata
-- `echoed_message` (str) - The message echoed back
-- `message_length` (int) - Length of the message
-- `reward` (float) - Reward based on message length (length × 0.1)
-- `done` (bool) - Always False for echo environment
-- `metadata` (dict) - Additional info like step count
-### Reward
-The reward is calculated as: `message_length × 0.1`
-- "Hi" → reward: 0.2
-- "Hello, World!" → reward: 1.3
-- Empty message → reward: 0.0
-## Advanced Usage
-### Connecting to an Existing Server
-If you already have a Prana Env environment server running, you can connect directly:
 ```python
-from prana_env import PranaEnv
-# Connect to existing server
-prana_envenv = PranaEnv(base_url="<ENV_HTTP_URL_HERE>")
-# Use as normal
-result = prana_envenv.reset()
-result = prana_envenv.step(PranaAction(message="Hello!"))
 ```
-Note: When connecting to an existing server, `prana_envenv.close()` will NOT stop the server.
-### Using the Context Manager
-The client supports context manager usage for automatic connection management:
 ```python
-from prana_env import PranaAction, PranaEnv
-# Connect with context manager (auto-connects and closes)
-with PranaEnv(base_url="http://localhost:8000") as env:
-    result = env.reset()
-    print(f"Reset: {result.observation.echoed_message}")
-    # Multiple steps with low latency
-    for msg in ["Hello", "World", "!"]:
-        result = env.step(PranaAction(message=msg))
-        print(f"Echoed: {result.observation.echoed_message}")
 ```
-The client uses WebSocket connections for:
-- **Lower latency**: No HTTP connection overhead per request
-- **Persistent session**: Server maintains your environment state
-- **Efficient for episodes**: Better for many sequential steps
-### Concurrent WebSocket Sessions
-The server supports multiple concurrent WebSocket connections. To enable this,
-modify `server/app.py` to use factory mode:
-```python
-# In server/app.py - use factory mode for concurrent sessions
-app = create_app(
-    PranaEnvironment,  # Pass class, not instance
-    PranaAction,
-    PranaObservation,
-    max_concurrent_envs=4,  # Allow 4 concurrent sessions
-)
-```
-Then multiple clients can connect simultaneously:
-```python
-from prana_env import PranaAction, PranaEnv
-from concurrent.futures import ThreadPoolExecutor
-def run_episode(client_id: int):
-    with PranaEnv(base_url="http://localhost:8000") as env:
-        result = env.reset()
-        for i in range(10):
-            result = env.step(PranaAction(message=f"Client {client_id}, step {i}"))
-        return client_id, result.observation.message_length
-# Run 4 episodes concurrently
-with ThreadPoolExecutor(max_workers=4) as executor:
-    results = list(executor.map(run_episode, range(4)))
 ```
-## Development & Testing
-### Direct Environment Testing
-Test the environment logic directly without starting the HTTP server:
-```bash
-# From the server directory
-python3 server/prana_env_environment.py
 ```
-This verifies that:
-- Environment resets correctly
-- Step executes actions properly
-- State tracking works
-- Rewards are calculated correctly
-### Running Locally
-Run the server locally for development:
 ```bash
-uvicorn server.app:app --reload
 ```
-## Project Structure
-```
-prana_env/
-├── .dockerignore         # Docker build exclusions
-├── __init__.py            # Module exports
-├── README.md              # This file
-├── openenv.yaml           # OpenEnv manifest
-├── pyproject.toml         # Project metadata and dependencies
-├── uv.lock                # Locked dependencies (generated)
-├── client.py              # PranaEnv client
-├── models.py              # Action and Observation models
-└── server/
-    ├── __init__.py        # Server module exports
-    ├── prana_env_environment.py  # Core environment logic
-    ├── app.py             # FastAPI application (HTTP + WebSocket endpoints)
-    └── Dockerfile         # Container image definition
-```

 ---
+title: PRANA-Env Environment Server
+emoji: 🏥
 colorFrom: purple
 colorTo: indigo
 sdk: docker
 base_path: /web
 tags:
   - openenv
+  - reinforcement-learning
+  - clinical
 ---
+# PRANA-Env
+**Policy Reinforced Administrative Navigation Agent** — an OpenEnv RL environment for kidney transplant administration.
+PRANA-Env simulates the multi-step clinical workflow required to file a KARS-compliant SRTR report for a transplant candidate. The agent must query fragmented datastores, detect stale lab values, and file a complete report — earning rewards from a deterministic KARS validator.
+## Architecture
+```
+LLM Agent (GPT-4o / fine-tuned model)
+        │
+        │  query_db / record_value / file_report
+        ▼
+  PranaEnv Client  ──(WebSocket)──  PranaEnvironment Server
+                                          │
+                                    KARS Validator
+                                    (reward signal)
+```
+## Action Space
+| Action | Required fields | Effect |
+|--------|----------------|--------|
+| `query_db` | `target`, `field`, `patient_id` | Returns current value from PatientDB |
+| `record_value` | `field`, `value` | Writes value into episode record with today's timestamp |
+| `file_report` | — | KARS validates record → reward → done |
+## Observation Space
+Every observation includes:
+```python
+PranaObservation(
+    query_result      # str: value, NOT_FOUND, RECORDED, KARS status
+    active_task       # str: current task context (t1–t5)
+    recorded_fields   # dict: {field: {value, recorded_at}} — full current record
+    missing_fields    # list[str]: KARS issues after file_report
+    kars_result       # str | None: "PASSED" | "FAILED"
+    reward            # float
+    done              # bool
+)
 ```
+`recorded_fields` shows the agent its full current state including timestamps — enabling staleness detection and selective re-querying.
+## Reward Signal
+| Event | Reward |
+|-------|--------|
+| KARS PASSED — first attempt | **+15** |
+| KARS PASSED — after correction | **+10** |
+| Re-query of already-fresh field | **−1** |
+| KARS FAILED — missing or stale fields | **−5** |
+| KARS FAILED — unrecoverable (3 attempts) | **−10** |
+## Temporal Model (T1 → T5)
+Episodes simulate a 4-month clinical timeline:
+- **T1 (2025-11-07)**: Initial labs recorded. Snapshot pre-loaded into episode record on `reset()`.
+- **T5 (2026-03-07)**: Filing date. KARS requires time-sensitive fields within **90 days**.
+On `reset()`, the agent sees a pre-populated record with stale T1 values. It must:
+1. Identify which fields are stale (`hba1c`, `gfr`, `creatinine` — time-sensitive)
+2. Re-query only those fields to get current T5 values
+3. Leave stable fields (`blood_type`) untouched — re-querying incurs a penalty
+4. File when the record is complete and fresh
+**Example trajectory:**
 ```
+reset() → record pre-loaded: {hba1c: {value: 7.2, recorded_at: 2025-11-07}, ...}
+query_db(hba1c)      → 8.9   (was 7.2 at T1)
+query_db(gfr)        → 12.1  (was 18.5 at T1)
+query_db(creatinine) → 4.7   (was 3.8 at T1)
+record_value × 3
+file_report()        → KARS PASSED, reward=+15
+```
+## Quick Start
 ```bash
+# Start the server
+conda activate openenv
+uvicorn server.app:app --host 0.0.0.0 --port 8000
 ```
 ```bash
+# Run the LLM agent loop
+python test_agent.py
 ```
 ```python
+# Run N episodes for GRPO rollout batch
+from test_agent import run_episodes
+trajectories = run_episodes(
+    task="File a KARS-compliant SRTR report for patient P001. "
+         "A T1 record exists from 4 months ago. "
+         "Check which fields are stale, re-query only what's needed, and file.",
+    patient_id="P001",
+    n=8,  # GRPO batch size
+)
 ```
+## Patients
+| ID | Condition | T1 GFR | T5 GFR | HbA1c T1→T5 | Notes |
+|----|-----------|--------|--------|-------------|-------|
+| P001 | CKD Stage 4 | 18.5 | 12.1 | 7.2→8.9 | Complete record |
+| P002 | Diabetic nephropathy | 11.0 | 8.3 | 9.1→10.2 | Antihypertensives, insulin |
+| P003 | CKD Stage 3 | 22.3 | 19.8 | null | HbA1c never recorded, inactive waitlist |
+## KARS Required Fields
+| Field | Source | Time-sensitive |
+|-------|--------|---------------|
+| `hba1c` | PatientDB | Yes — 90-day window |
+| `gfr` | PatientDB | Yes — 90-day window |
+| `creatinine` | PatientDB | Yes — 90-day window |
+| `blood_type` | PatientDB | No — stable |
+## Project Structure
+```
+prana_env/
+├── client.py                      # PranaEnv WebSocket client
+├── models.py                      # PranaAction, PranaObservation
+├── test_agent.py                  # LLM agent RL loop (GPT-4o)
+├── test_client.py                 # Smoke test client
+├── data/
+│   └── patient_db.json            # Patient records with T1 snapshots and T5 values
+└── server/
+    ├── app.py                     # FastAPI + WebSocket server
+    ├── prana_env_environment.py   # RL environment: actions, KARS validator, rewards
+    └── Dockerfile
 ```
+## Connecting to an Existing Server
+```python
+from prana_env.client import PranaEnv
+from prana_env.models import PranaAction
+with PranaEnv(base_url="http://localhost:8000") as env:
+    result = env.reset(patient_id="P001")
+    print(result.observation.query_result)
+    result = env.step(PranaAction(action_type="query_db", target="PatientDB",
+                                  field="hba1c", patient_id="P001"))
+    print(result.observation.query_result)   # "8.9"
+    print(result.observation.recorded_fields)  # current record state
 ```
+## Deploying to Hugging Face Spaces
 ```bash
+openenv push
+# or
+openenv push --repo-id my-org/prana-env --private
 ```
+After deployment:
+- **Web UI**: `/web`
+- **API docs**: `/docs`
+- **Health**: `/health`
+- **WebSocket**: `/ws`
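
The Temporal Model and KARS Required Fields sections of the new README imply a simple recency rule: a time-sensitive field is stale when its `recorded_at` date falls before the filing date minus 90 days. The following is a minimal sketch of that rule from the agent's side, assuming the `recorded_fields` shape shown in the Observation Space section. `FILING_DATE` and `stale_fields` are illustrative names and are not part of the repository.

```python
from datetime import date, timedelta

# Values quoted in the README: T5 filing date and the 90-day KARS window.
FILING_DATE = date(2026, 3, 7)
RECENCY_DAYS = 90
TIME_SENSITIVE = {"hba1c", "gfr", "creatinine"}


def stale_fields(recorded_fields: dict) -> list[str]:
    """Return time-sensitive fields recorded before the recency cutoff."""
    cutoff = FILING_DATE - timedelta(days=RECENCY_DAYS)
    stale = []
    for name, entry in recorded_fields.items():
        if name not in TIME_SENSITIVE:
            continue  # stable fields such as blood_type are never re-queried
        if date.fromisoformat(entry["recorded_at"]) < cutoff:
            stale.append(name)
    return sorted(stale)


# Record as pre-loaded on reset() for P001, using the fixed T1 date from the README.
record = {
    "hba1c": {"value": "7.2", "recorded_at": "2025-11-07"},
    "gfr": {"value": "18.5", "recorded_at": "2025-11-07"},
    "creatinine": {"value": "3.8", "recorded_at": "2025-11-07"},
    "blood_type": {"value": "A+", "recorded_at": "2025-11-07"},
}
print(stale_fields(record))
```

With the fixed T1 date all three time-sensitive fields are flagged and `blood_type` is skipped, which matches the example trajectory. The environment source in this commit randomizes the T1 date, so an agent would need to run a check like this every episode and cannot assume the result.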

client.py CHANGED

@@ -10,20 +10,7 @@ from .models import PranaAction, PranaObservation
 class PranaEnv(EnvClient[PranaAction, PranaObservation, State]):
-    """
-    Client for PRANA-Env.
-    Example:
-        >>> with PranaEnv(base_url="http://localhost:8000") as client:
-        ...     client.reset()
-        ...     result = client.step(PranaAction(
-        ...         action_type="query_db",
-        ...         target="PatientDB",
-        ...         field="hba1c",
-        ...         patient_id="P001",
-        ...     ))
-        ...     print(result.observation.query_result)  # "7.2"
-    """
     def _step_payload(self, action: PranaAction) -> Dict:
         return {k: v for k, v in action.model_dump().items() if v is not None}
@@ -34,6 +21,9 @@ class PranaEnv(EnvClient[PranaAction, PranaObservation, State]):
             query_result=obs_data.get("query_result", ""),
             active_task=obs_data.get("active_task", "t1"),
             policy_alerts=obs_data.get("policy_alerts", ""),
             done=payload.get("done", False),
             reward=payload.get("reward", 0.0),
             metadata=obs_data.get("metadata", {}),

 class PranaEnv(EnvClient[PranaAction, PranaObservation, State]):
+    """Client for PRANA-Env."""
     def _step_payload(self, action: PranaAction) -> Dict:
         return {k: v for k, v in action.model_dump().items() if v is not None}
             query_result=obs_data.get("query_result", ""),
             active_task=obs_data.get("active_task", "t1"),
             policy_alerts=obs_data.get("policy_alerts", ""),
+            kars_result=obs_data.get("kars_result"),
+            missing_fields=obs_data.get("missing_fields", []),
+            recorded_fields=obs_data.get("recorded_fields", {}),
             done=payload.get("done", False),
             reward=payload.get("reward", 0.0),
             metadata=obs_data.get("metadata", {}),
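
`_step_payload` drops every field that is `None`, so an action such as `file_report` is sent with only its `action_type`. The sketch below reproduces that filtering with a standard-library dataclass in place of the real pydantic model. `ActionSketch` and `step_payload` are hypothetical names used only for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ActionSketch:
    """Hypothetical stand-in for PranaAction (not the real pydantic model)."""
    action_type: str
    target: Optional[str] = None
    field: Optional[str] = None
    patient_id: Optional[str] = None
    value: Optional[str] = None


def step_payload(action: ActionSketch) -> dict:
    # Same comprehension as _step_payload, with asdict() replacing model_dump().
    return {k: v for k, v in asdict(action).items() if v is not None}


print(step_payload(ActionSketch(action_type="file_report")))
print(step_payload(ActionSketch(action_type="query_db", target="PatientDB",
                                field="hba1c", patient_id="P001")))
```

Omitting unset fields keeps the wire payload small and lets the server fall back to its own defaults, for example the episode's `patient_id` when the action does not carry one.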

data/patient_db.json CHANGED

@@ -5,20 +5,36 @@
       "name": "Jane Doe",
       "age": 52,
       "blood_type": "A+",
-      "hba1c": 7.2,
-      "gfr": 18.5,
-      "creatinine": 3.8,
-      "pra": 12
     },
     "P002": {
       "patient_id": "P002",
       "name": "John Smith",
       "age": 61,
       "blood_type": "O-",
-      "hba1c": 9.1,
-      "gfr": 11.0,
-      "creatinine": 5.2,
-      "pra": 45
     },
     "P003": {
       "patient_id": "P003",
@@ -26,9 +42,17 @@
       "age": 47,
       "blood_type": "B+",
       "hba1c": null,
-      "gfr": 22.3,
-      "creatinine": 3.1,
-      "pra": 8
     }
   }
 }

       "name": "Jane Doe",
       "age": 52,
       "blood_type": "A+",
+      "hba1c": 8.9,
+      "gfr": 12.1,
+      "creatinine": 4.7,
+      "pra": 12,
+      "t1_snapshot": {
+        "hba1c": 7.2,
+        "gfr": 18.5,
+        "creatinine": 3.8,
+        "blood_type": "A+",
+        "pra": 12,
+        "recorded_at": "2025-11-07"
+      }
     },
     "P002": {
       "patient_id": "P002",
       "name": "John Smith",
       "age": 61,
       "blood_type": "O-",
+      "hba1c": 10.2,
+      "gfr": 8.3,
+      "creatinine": 6.1,
+      "pra": 45,
+      "t1_snapshot": {
+        "hba1c": 9.1,
+        "gfr": 11.0,
+        "creatinine": 5.2,
+        "blood_type": "O-",
+        "pra": 45,
+        "recorded_at": "2025-11-07"
+      }
     },
     "P003": {
       "patient_id": "P003",
       "age": 47,
       "blood_type": "B+",
       "hba1c": null,
+      "gfr": 19.8,
+      "creatinine": 3.4,
+      "pra": 8,
+      "t1_snapshot": {
+        "hba1c": null,
+        "gfr": 22.3,
+        "creatinine": 3.1,
+        "blood_type": "B+",
+        "pra": 8,
+        "recorded_at": "2025-11-07"
+      }
     }
   }
 }
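
After this change each patient carries two layers: top-level fields hold the current (T5) values and `t1_snapshot` holds the older readings. A minimal sketch of comparing the two layers follows, using a trimmed copy of the P001 entry. `changed_fields` is an illustrative helper and does not appear in the repository.

```python
import json

# Trimmed copy of the P001 entry from the diff (pra and identity fields omitted).
PATIENT_JSON = """
{
  "hba1c": 8.9, "gfr": 12.1, "creatinine": 4.7, "blood_type": "A+",
  "t1_snapshot": {"hba1c": 7.2, "gfr": 18.5, "creatinine": 3.8,
                  "blood_type": "A+", "recorded_at": "2025-11-07"}
}
"""


def changed_fields(patient: dict) -> dict:
    """Map field name -> (T1 value, T5 value) for fields that differ."""
    snapshot = patient["t1_snapshot"]
    return {
        name: (old, patient.get(name))
        for name, old in snapshot.items()
        if name != "recorded_at" and patient.get(name) != old
    }


patient = json.loads(PATIENT_JSON)
for name, (t1, t5) in changed_fields(patient).items():
    print(f"{name}: {t1} -> {t5}")
```

For P001 the three lab values differ and `blood_type` does not, which is consistent with the README treating `blood_type` as stable.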

models.py CHANGED

@@ -5,7 +5,7 @@ Action space and observation space for the kidney transplant
 administration environment.
 """
-from typing import Optional
 from pydantic import Field
@@ -16,24 +16,19 @@ class PranaAction(Action):
     """
     Action for PRANA-Env.
-    For the smoke test, only action_type='query_db' is supported.
-    Example:
-        >>> action = PranaAction(
-        ...     action_type="query_db",
-        ...     target="PatientDB",
-        ...     field="hba1c",
-        ...     patient_id="P001",
-        ... )
     """
     action_type: str = Field(
         ...,
         description=(
-            "Type of action: query_db | record_value | update_past_record "
-            "| search_policy | infer_from_evidence | file_report | advance_task"
         ),
     )
     target: Optional[str] = Field(
         default=None,
         description="Datastore name for query_db (PatientDB, ClinicalNotesDB, PharmacyDB, WaitlistDB)",
@@ -44,8 +39,12 @@ class PranaAction(Action):
     patient_id: Optional[str] = Field(
         default=None, description="Patient identifier"
     )
     value: Optional[str] = Field(
-        default=None, description="Value to record (for record_value / update_past_record)"
     )
     task_ref: Optional[str] = Field(
         default=None, description="Task reference for retroactive updates (e.g. 't1')"
@@ -58,8 +57,6 @@ class PranaAction(Action):
 class PranaObservation(Observation):
     """
     Observation from PRANA-Env.
-    Contains the result of the last action plus episode context.
     """
     query_result: str = Field(
@@ -74,3 +71,16 @@ class PranaObservation(Observation):
         default="",
         description="Any OPTN policy rules triggered by this observation",
     )

 administration environment.
 """
+from typing import List, Optional
 from pydantic import Field
     """
     Action for PRANA-Env.
+    Supported action_types:
+      query_db       — retrieve a field from a datastore
+      record_value   — write a field into the episode patient record
+      file_report    — submit compiled record to KARS validator
     """
     action_type: str = Field(
         ...,
         description=(
+            "Type of action: query_db | record_value | file_report"
         ),
     )
+    # query_db / record_value
     target: Optional[str] = Field(
         default=None,
         description="Datastore name for query_db (PatientDB, ClinicalNotesDB, PharmacyDB, WaitlistDB)",
     patient_id: Optional[str] = Field(
         default=None, description="Patient identifier"
     )
+    # record_value
     value: Optional[str] = Field(
+        default=None, description="Value to record"
+    )
+    source: Optional[str] = Field(
+        default=None, description="Source datastore the value was retrieved from"
     )
     task_ref: Optional[str] = Field(
         default=None, description="Task reference for retroactive updates (e.g. 't1')"
 class PranaObservation(Observation):
     """
     Observation from PRANA-Env.
     """
     query_result: str = Field(
         default="",
         description="Any OPTN policy rules triggered by this observation",
     )
+    # Populated after file_report
+    kars_result: Optional[str] = Field(
+        default=None,
+        description="KARS validation result: PASSED or FAILED",
+    )
+    missing_fields: List[str] = Field(
+        default_factory=list,
+        description="Fields missing from the report per KARS requirements",
+    )
+    recorded_fields: dict = Field(
+        default_factory=dict,
+        description="Current patient record — fields recorded so far this episode",
+    )
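
`missing_fields` is a list of strings, not bare field names. The validator added in this commit formats each issue as `"<field> (<reason>)"`, for example `"hba1c (missing)"`. An agent that wants to act on a failed filing therefore has to split the name from the reason. The sketch below shows one way to do that, assuming the issue format stays as in `kars_validate`. `fields_to_fix` is a hypothetical helper.

```python
def fields_to_fix(missing_fields: list[str]) -> dict[str, str]:
    """Map field name -> short reason ('missing', 'stale', 'invalid date')."""
    result = {}
    for issue in missing_fields:
        # "gfr (stale: recorded ...)" -> name="gfr", detail="stale: recorded ...)"
        name, _, detail = issue.partition(" (")
        reason = detail.rstrip(")").split(":", 1)[0]
        result[name] = reason
    return result


issues = [
    "hba1c (missing)",
    "gfr (stale: recorded 2025-11-07, must be after 2025-12-07)",
]
print(fields_to_fix(issues))
```

A structured field (a list of `{field, reason}` objects) would spare clients this parsing, but that would be a change to the observation model and is outside what this commit does.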

server/prana_env_environment.py CHANGED

@@ -1,14 +1,34 @@
 """
 PRANA-Env Environment Implementation.
-Kidney transplant administration RL environment built on OpenEnv.
-Phase 1 smoke test: supports query_db action against PatientDB.
 """
-import json
 import logging
 from pathlib import Path
 from uuid import uuid4
 from openenv.core.env_server.interfaces import Environment
 from openenv.core.env_server.types import State
@@ -18,22 +38,63 @@ from models import PranaAction, PranaObservation
 tag = "[prana_env/environment]"
 logger = logging.getLogger(__name__)
-# Path to data directory — resolved relative to this file
 DATA_DIR = Path(__file__).parent.parent / "data"
-class PranaEnvironment(Environment):
     """
-    PRANA-Env: kidney transplant administration environment.
-    Episode structure (5 tasks):
-      t1: Initial Labs       — query PatientDB (HbA1c, GFR, creatinine)
-      t2: Waitlist Update    — query/update WaitlistDB
-      t3: Medication Review  — query PharmacyDB
-      t4: Physician Notes    — query ClinicalNotesDB
-      t5: SRTR Report Filing — file_report → KARS validator
-    Phase 1 smoke test: query_db against PatientDB only.
     """
     SUPPORTS_CONCURRENT_SESSIONS: bool = True
@@ -43,26 +104,88 @@ class PranaEnvironment(Environment):
         self._state = State(episode_id=str(uuid4()), step_count=0)
         self._active_task = "t1"
         self._patient_id: str | None = None
         self._patient_db = self._load_db("patient_db.json")
         logger.info(f"{tag} Loaded PatientDB with {len(self._patient_db.get('patients', {}))} patients")
     def _load_db(self, filename: str) -> dict:
         path = DATA_DIR / filename
-        logger.info(f"{tag} Loading datastore from {path}")
         with open(path) as f:
             return json.load(f)
     def reset(self, seed: int | None = None, episode_id: str | None = None, **kwargs) -> PranaObservation:
         patient_id: str | None = kwargs.get("patient_id")
         self._state = State(episode_id=episode_id or str(uuid4()), step_count=0)
         self._active_task = "t1"
         self._patient_id = patient_id
-        logger.info(f"{tag} reset — episode={self._state.episode_id} patient_id={patient_id}")
         return PranaObservation(
-            query_result="Episode reset. Ready for task t1: Initial Labs.",
             active_task=self._active_task,
             done=False,
             reward=0.0,
         )
@@ -71,83 +194,244 @@ class PranaEnvironment(Environment):
         self._state.step_count += 1
         logger.info(
             f"{tag} step={self._state.step_count} action_type={action.action_type} "
-            f"target={action.target} field={action.field} patient_id={action.patient_id}"
         )
         if action.action_type == "query_db":
             return self._handle_query_db(action)
         logger.warning(f"{tag} Unsupported action_type={action.action_type}")
         return PranaObservation(
-            query_result=f"NOT_SUPPORTED: action_type '{action.action_type}' not implemented yet.",
             active_task=self._active_task,
             done=False,
             reward=0.0,
         )
     def _handle_query_db(self, action: PranaAction) -> PranaObservation:
         db_name = (action.target or "").lower()
         field = (action.field or "").lower()
         patient_id = action.patient_id or self._patient_id
-        logger.info(f"{tag} query_db db={db_name} field={field} patient_id={patient_id}")
         if db_name != "patientdb":
-            logger.warning(f"{tag} Datastore '{db_name}' not available in Phase 1")
             return PranaObservation(
-                query_result=f"NOT_AVAILABLE: datastore '{action.target}' not loaded in Phase 1.",
                 active_task=self._active_task,
                 done=False,
                 reward=0.0,
             )
         if not patient_id:
             return PranaObservation(
-                query_result="ERROR: patient_id is required for query_db.",
                 active_task=self._active_task,
                 done=False,
                 reward=0.0,
             )
         patients = self._patient_db.get("patients", {})
         patient = patients.get(patient_id)
         if not patient:
-            logger.info(f"{tag} patient_id={patient_id} NOT_FOUND in PatientDB")
             return PranaObservation(
                 query_result=f"NOT_FOUND: patient '{patient_id}' not in PatientDB.",
                 active_task=self._active_task,
                 done=False,
                 reward=0.0,
             )
-        if field not in patient:
-            logger.info(f"{tag} field={field} NOT_FOUND for patient={patient_id}")
             return PranaObservation(
-                query_result=f"NOT_FOUND: field '{field}' not in PatientDB for patient '{patient_id}'.",
                 active_task=self._active_task,
                 done=False,
                 reward=0.0,
             )
-        value = patient[field]
-        if value is None:
-            logger.info(f"{tag} field={field} is NULL for patient={patient_id}")
             return PranaObservation(
-                query_result=f"NOT_FOUND: field '{field}' has no recorded value for patient '{patient_id}'.",
                 active_task=self._active_task,
                 done=False,
                 reward=0.0,
             )
-        logger.info(f"{tag} query_db success field={field} value={value} patient={patient_id}")
         return PranaObservation(
-            query_result=str(value),
             active_task=self._active_task,
             done=False,
             reward=0.0,
         )
     @property
     def state(self) -> State:
         return self._state

 """
 PRANA-Env Environment Implementation.
+Minimal RL loop:
+  1. query_db     — retrieve field from PatientDB
+  2. record_value — write field into episode patient record
+  3. file_report  — KARS validator → reward signal → episode done
+Reward:
+  +15  KARS PASSED on first attempt
+  +10  KARS PASSED after prior failed attempt
+   -1  query_db for a field already fresh in the record (inefficiency penalty)
+   -5  file_report with missing or stale required fields
+  -10  unrecoverable KARS failure (max filing attempts exceeded)
+Stochasticity (4 sources):
+  1. T1 date randomization   — T1 age sampled Uniform(T1_AGE_MIN, T1_AGE_MAX) days
+                               Agent must calculate staleness dynamically, not memorize
+  2. Random patient selection — if no patient_id given, pick randomly from pool
+  3. Anomaly injection        — with ANOMALY_PROB, inject a spurious reading for one
+                               time-sensitive field; agent must detect and escalate
+  4. Field availability noise — with PENDING_PROB, a field returns PENDING on first
+                               query; resolved on retry (simulates data entry lag)
 """
 import logging
+import random
+from datetime import date, timedelta
 from pathlib import Path
 from uuid import uuid4
+import json
 from openenv.core.env_server.interfaces import Environment
 from openenv.core.env_server.types import State
 tag = "[prana_env/environment]"
 logger = logging.getLogger(__name__)
 DATA_DIR = Path(__file__).parent.parent / "data"
+# KARS required fields
+KARS_REQUIRED_FIELDS = ["hba1c", "gfr", "creatinine", "blood_type"]
+TIME_SENSITIVE_FIELDS = {"hba1c", "gfr", "creatinine"}
+STABLE_FIELDS = {"blood_type", "pra"}
+MAX_FILE_ATTEMPTS = 3
+# Temporal constants
+EPISODE_DATE = date(2026, 3, 7)
+RECENCY_DAYS = 90
+# ── Stochasticity parameters ──────────────────────────────────────────────────
+T1_AGE_MIN_DAYS = 60       # shortest possible T1 record age (fresh — no re-query needed)
+T1_AGE_MAX_DAYS = 150      # longest possible T1 record age (stale — must re-query)
+ANOMALY_PROB = 0.30        # probability of injecting anomalous reading per episode
+ANOMALY_DELTA = 0.40       # anomalous value deviates by this fraction from true T5
+ANOMALY_WINDOW_DAYS = 14   # anomaly detection window (matches OPTN Clinical Integrity Policy)
+ANOMALY_THRESHOLD = 0.25   # flag if delta > 25% within window
+PENDING_PROB = 0.15        # probability of PENDING response on first query of a field
+def kars_validate(record: dict) -> tuple[bool, list[str]]:
     """
+    Deterministic KARS validator with recency checks.
+    record values: {field: {"value": ..., "recorded_at": "YYYY-MM-DD"}}
+    Returns (passed, issues).
+    """
+    cutoff = EPISODE_DATE - timedelta(days=RECENCY_DAYS)
+    issues = []
+    for f in KARS_REQUIRED_FIELDS:
+        entry = record.get(f)
+        if entry is None or entry.get("value") is None:
+            issues.append(f"{f} (missing)")
+            continue
+        if f in TIME_SENSITIVE_FIELDS:
+            try:
+                recorded_at = date.fromisoformat(entry.get("recorded_at", ""))
+                if recorded_at < cutoff:
+                    issues.append(f"{f} (stale: recorded {recorded_at}, must be after {cutoff})")
+            except (TypeError, ValueError):
+                issues.append(f"{f} (invalid date)")
+    return (len(issues) == 0, issues)
+class PranaEnvironment(Environment):
+    """
+    PRANA-Env: kidney transplant administration RL environment.
+    Stochastic per-episode:
+      - T1 record age varies (60–150 days) — agent must calculate recency dynamically
+      - Patient selected randomly if not specified
+      - One time-sensitive field may have an injected anomalous reading (30% episodes)
+      - Some fields return PENDING on first query (15% per field) — retry resolves
     """
     SUPPORTS_CONCURRENT_SESSIONS: bool = True
         self._state = State(episode_id=str(uuid4()), step_count=0)
         self._active_task = "t1"
         self._patient_id: str | None = None
+        self._patient_record: dict = {}
+        self._file_attempts: int = 0
+        self._t1_date: date = EPISODE_DATE - timedelta(days=120)
+        self._pending_fields: set = set()
+        self._injected_anomaly: dict | None = None
         self._patient_db = self._load_db("patient_db.json")
         logger.info(f"{tag} Loaded PatientDB with {len(self._patient_db.get('patients', {}))} patients")
     def _load_db(self, filename: str) -> dict:
         path = DATA_DIR / filename
         with open(path) as f:
             return json.load(f)
+    def _make_entry(self, value, recorded_at: date) -> dict:
+        return {"value": str(value), "recorded_at": recorded_at.isoformat()}
     def reset(self, seed: int | None = None, episode_id: str | None = None, **kwargs) -> PranaObservation:
         patient_id: str | None = kwargs.get("patient_id")
+        patients = self._patient_db.get("patients", {})
+        # ── Stochasticity 2: random patient selection ─────────────────────────
+        if not patient_id:
+            patient_id = random.choice(list(patients.keys()))
+            logger.info(f"{tag} No patient_id specified — randomly selected {patient_id}")
         self._state = State(episode_id=episode_id or str(uuid4()), step_count=0)
         self._active_task = "t1"
         self._patient_id = patient_id
+        self._patient_record = {}
+        self._file_attempts = 0
+        self._pending_fields = set()
+        self._injected_anomaly = None
+        # ── Stochasticity 1: randomize T1 record age ──────────────────────────
+        t1_days_ago = random.randint(T1_AGE_MIN_DAYS, T1_AGE_MAX_DAYS)
+        self._t1_date = EPISODE_DATE - timedelta(days=t1_days_ago)
+        cutoff = EPISODE_DATE - timedelta(days=RECENCY_DAYS)
+        t1_is_stale = self._t1_date < cutoff
+        # Pre-populate record with T1 snapshot at randomized date
+        patient = patients.get(patient_id, {})
+        snapshot = patient.get("t1_snapshot", {})
+        for field in KARS_REQUIRED_FIELDS:
+            val = snapshot.get(field)
+            if val is not None:
+                self._patient_record[field] = self._make_entry(val, self._t1_date)
+        # ── Stochasticity 3: anomaly injection ────────────────────────────────
+        if random.random() < ANOMALY_PROB:
+            field = random.choice(sorted(TIME_SENSITIVE_FIELDS))
+            t5_value = patient.get(field)
+            if t5_value is not None:
+                direction = random.choice([-1, 1])
+                anomaly_value = round(float(t5_value) * (1 + direction * ANOMALY_DELTA), 1)
+                anomaly_days = random.randint(1, 6)
+                self._injected_anomaly = {
+                    "field": field,
+                    "value": anomaly_value,
+                    "recorded_at": (EPISODE_DATE - timedelta(days=anomaly_days)).isoformat(),
+                }
+                logger.info(f"{tag} Injected anomaly: {self._injected_anomaly}")
+        logger.info(
+            f"{tag} reset episode={self._state.episode_id} patient={patient_id} "
+            f"t1_date={self._t1_date} t1_stale={t1_is_stale} "
+            f"anomaly={self._injected_anomaly}"
+        )
+        stale_note = (
+            f"T1 record is {'STALE (>90 days)' if t1_is_stale else 'FRESH (≤90 days)'}."
+        )
         return PranaObservation(
+            query_result=(
+                f"Episode reset. Patient: {patient_id}. "
+                f"Filing date: {EPISODE_DATE}. "
+                f"T1 record date: {self._t1_date} ({t1_days_ago} days ago). {stale_note} "
+                f"Required fields: {KARS_REQUIRED_FIELDS}. "
+                f"Time-sensitive {sorted(TIME_SENSITIVE_FIELDS)} must be recorded after {cutoff}."
+            ),
             active_task=self._active_task,
+            recorded_fields=self._patient_record.copy(),
             done=False,
             reward=0.0,
         )
         self._state.step_count += 1
         logger.info(
             f"{tag} step={self._state.step_count} action_type={action.action_type} "
+            f"field={action.field} value={action.value}"
         )
         if action.action_type == "query_db":
             return self._handle_query_db(action)
+        if action.action_type == "record_value":
+            return self._handle_record_value(action)
+        if action.action_type == "file_report":
+            return self._handle_file_report(action)
         logger.warning(f"{tag} Unsupported action_type={action.action_type}")
         return PranaObservation(
+            query_result=f"NOT_SUPPORTED: action_type '{action.action_type}'.",
             active_task=self._active_task,
+            recorded_fields=self._patient_record.copy(),
             done=False,
             reward=0.0,
         )
+    # ── Action handlers ───────────────────────────────────────────────────────
     def _handle_query_db(self, action: PranaAction) -> PranaObservation:
         db_name = (action.target or "").lower()
         field = (action.field or "").lower()
         patient_id = action.patient_id or self._patient_id
         if db_name != "patientdb":
             return PranaObservation(
+                query_result=f"NOT_AVAILABLE: datastore '{action.target}' not in Phase 1.",
                 active_task=self._active_task,
+                recorded_fields=self._patient_record.copy(),
                 done=False,
                 reward=0.0,
             )
         if not patient_id:
             return PranaObservation(
+                query_result="ERROR: patient_id required.",
                 active_task=self._active_task,
+                recorded_fields=self._patient_record.copy(),
                 done=False,
                 reward=0.0,
             )
+        # Inefficiency penalty — field already fresh in record
+        cutoff = EPISODE_DATE - timedelta(days=RECENCY_DAYS)
+        if field in self._patient_record:
+            entry = self._patient_record[field]
+            try:
+                recorded_at = date.fromisoformat(entry.get("recorded_at", ""))
+                if field in STABLE_FIELDS or recorded_at >= cutoff:
+                    logger.info(f"{tag} field={field} already fresh — inefficiency penalty")
+                    return PranaObservation(
+                        query_result=f"ALREADY_RECORDED: '{field}' = {entry['value']} (recorded {entry['recorded_at']})",
+                        active_task=self._active_task,
+                        recorded_fields=self._patient_record.copy(),
+                        done=False,
+                        reward=-1.0,
+                    )
+            except ValueError:
+                pass
         patients = self._patient_db.get("patients", {})
         patient = patients.get(patient_id)
         if not patient:
             return PranaObservation(
                 query_result=f"NOT_FOUND: patient '{patient_id}' not in PatientDB.",
                 active_task=self._active_task,
+                recorded_fields=self._patient_record.copy(),
                 done=False,
                 reward=0.0,
             )
+        # ── Stochasticity 4: field availability noise (PENDING) ───────────────
+        if field in TIME_SENSITIVE_FIELDS and field not in self._pending_fields:
+            if random.random() < PENDING_PROB:
+                self._pending_fields.add(field)
+                logger.info(f"{tag} field={field} returned PENDING (will resolve on retry)")
+                return PranaObservation(
+                    query_result=(
+                        f"PENDING: '{field}' not yet entered for patient '{patient_id}'. "
+                        f"Data entry in progress — retry."
+                    ),
+                    active_task=self._active_task,
+                    recorded_fields=self._patient_record.copy(),
+                    done=False,
+                    reward=0.0,
+                )
+        value = patient.get(field)
+        if value is None:
             return PranaObservation(
+                query_result=f"NOT_FOUND: '{field}' has no value for patient '{patient_id}'.",
                 active_task=self._active_task,
+                recorded_fields=self._patient_record.copy(),
                 done=False,
                 reward=0.0,
             )
+        # ── Stochasticity 3: include anomaly in history if injected ───────────
+        if field in TIME_SENSITIVE_FIELDS:
+            query_result = self._format_lab_history(field, patient_id, value)
+        else:
+            query_result = str(value)
+        logger.info(f"{tag} query_db OK field={field} value={value}")
+        return PranaObservation(
+            query_result=query_result,
+            active_task=self._active_task,
+            recorded_fields=self._patient_record.copy(),
+            done=False,
+            reward=0.0,
+        )
+    def _format_lab_history(self, field: str, patient_id: str, t5_value) -> str:
+        """
+        Format a time-sensitive field as a timestamped history.
+        Includes T1 snapshot entry, T5 current entry, and injected anomaly if present.
+        Flags anomalies per OPTN Clinical Integrity Policy.
+        """
+        snapshot = self._patient_db["patients"][patient_id].get("t1_snapshot", {})
+        t1_val = snapshot.get(field)
+        history: list[tuple[date, float]] = []
+        if t1_val is not None:
+            history.append((self._t1_date, float(t1_val)))
+        # Inject anomalous reading if this is the affected field
+        if self._injected_anomaly and self._injected_anomaly["field"] == field:
+            anom_date = date.fromisoformat(self._injected_anomaly["recorded_at"])
+            history.append((anom_date, self._injected_anomaly["value"]))
+        history.append((EPISODE_DATE, float(t5_value)))
+        history.sort(key=lambda x: x[0])
+        lines = []
+        for i, (d, v) in enumerate(history):
+            suffix = " ← latest" if i == len(history) - 1 else ""
+            lines.append(f"  {v} (recorded: {d}){suffix}")
+        result = (
+            f"{field} measurement history for {patient_id} "
+            f"(filing date: {EPISODE_DATE}):\n" + "\n".join(lines)
+        )
+        # Check for anomaly between consecutive entries within window
+        for i in range(len(history) - 1):
+            d1, v1 = history[i]
+            d2, v2 = history[i + 1]
+            days_apart = (d2 - d1).days
+            if days_apart <= ANOMALY_WINDOW_DAYS and v1 > 0:
+                change = abs(v2 - v1) / v1
+                if change >= ANOMALY_THRESHOLD:
+                    pct = round(change * 100, 1)
+                    result += (
+                        f"\n⚠️ ANOMALY DETECTED: {v1} ({d1}) → {v2} ({d2}), "
+                        f"{days_apart} days apart, {pct}% delta. "
+                        f"Recommend confirmatory test before filing."
+                    )
+        return result
+    def _handle_record_value(self, action: PranaAction) -> PranaObservation:
+        field = (action.field or "").lower()
+        value = action.value
+        if not field or value is None:
             return PranaObservation(
+                query_result="ERROR: field and value are required for record_value.",
                 active_task=self._active_task,
+                recorded_fields=self._patient_record.copy(),
                 done=False,
                 reward=0.0,
             )
+        self._patient_record[field] = self._make_entry(value, EPISODE_DATE)
+        logger.info(f"{tag} record_value field={field} value={value}")
+        # Counts required fields that have any value; freshness is checked at filing time.
+        required_present = sum(
+            1 for f in KARS_REQUIRED_FIELDS
+            if f in self._patient_record and self._patient_record[f].get("value") is not None
+        )
         return PranaObservation(
+            query_result=(
+                f"RECORDED: {field} = {value} (as of {EPISODE_DATE}). "
+                f"Record has {required_present}/{len(KARS_REQUIRED_FIELDS)} required fields."
+            ),
             active_task=self._active_task,
+            recorded_fields=self._patient_record.copy(),
             done=False,
             reward=0.0,
         )
+    def _handle_file_report(self, action: PranaAction) -> PranaObservation:
+        self._file_attempts += 1
+        passed, issues = kars_validate(self._patient_record)
+        logger.info(
+            f"{tag} file_report attempt={self._file_attempts} "
+            f"passed={passed} issues={issues}"
+        )
+        if passed:
+            reward = 15.0 if self._file_attempts == 1 else 10.0
+            logger.info(f"{tag} KARS PASSED reward={reward}")
+            return PranaObservation(
+                query_result="KARS PASSED. SRTR report accepted.",
+                active_task=self._active_task,
+                kars_result="PASSED",
+                missing_fields=[],
+                recorded_fields=self._patient_record.copy(),
+                done=True,
+                reward=reward,
+            )
+        if self._file_attempts >= MAX_FILE_ATTEMPTS:
+            logger.warning(f"{tag} KARS FAILED unrecoverable after {self._file_attempts} attempts")
+            return PranaObservation(
+                query_result=f"KARS FAILED (unrecoverable). Issues: {issues}",
+                active_task=self._active_task,
+                kars_result="FAILED",
+                missing_fields=issues,
+                recorded_fields=self._patient_record.copy(),
+                done=True,
+                reward=-10.0,
+            )
+        logger.info(f"{tag} KARS FAILED recoverable issues={issues}")
+        return PranaObservation(
+            query_result=f"KARS FAILED. Issues: {issues}. Fix and file again.",
+            active_task=self._active_task,
+            kars_result="FAILED",
+            missing_fields=issues,
+            recorded_fields=self._patient_record.copy(),
+            done=False,
+            reward=-5.0,
+        )
     @property
     def state(self) -> State:
         return self._state
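The recency rule and the anomaly flag in the environment above can be hard to follow inside the handlers, so here is a minimal standalone sketch of both checks. It is not the environment's code: `stale_fields`, `flag_anomalies`, and `entry` are hypothetical helper names, and the values of `EPISODE_DATE`, `ANOMALY_WINDOW_DAYS`, and `ANOMALY_THRESHOLD` are assumptions, since the diff does not show where those constants are defined. Only the 90-day window and the three time-sensitive field names are taken from the source.

```python
from datetime import date, timedelta

# Illustrative constants. RECENCY_DAYS and the field names follow the diff above;
# EPISODE_DATE, ANOMALY_WINDOW_DAYS and ANOMALY_THRESHOLD are assumed values.
EPISODE_DATE = date(2026, 3, 7)
RECENCY_DAYS = 90
TIME_SENSITIVE_FIELDS = {"hba1c", "gfr", "creatinine"}
ANOMALY_WINDOW_DAYS = 7
ANOMALY_THRESHOLD = 0.25


def stale_fields(record: dict, filing_date: date = EPISODE_DATE) -> list[str]:
    """Return time-sensitive fields recorded before the recency cutoff."""
    cutoff = filing_date - timedelta(days=RECENCY_DAYS)
    return [
        name
        for name in sorted(TIME_SENSITIVE_FIELDS)
        if name in record
        and date.fromisoformat(record[name]["recorded_at"]) < cutoff
    ]


def flag_anomalies(history: list[tuple[date, float]]) -> list[tuple[date, date, float]]:
    """Return (earlier, later, percent_change) for suspicious consecutive readings."""
    ordered = sorted(history, key=lambda item: item[0])
    flagged = []
    for (d1, v1), (d2, v2) in zip(ordered, ordered[1:]):
        days_apart = (d2 - d1).days
        # Only readings close together in time are compared, as in _format_lab_history.
        if days_apart <= ANOMALY_WINDOW_DAYS and v1 > 0:
            change = abs(v2 - v1) / v1
            if change >= ANOMALY_THRESHOLD:
                flagged.append((d1, d2, round(change * 100, 1)))
    return flagged


def entry(value: float, days_ago: int) -> dict:
    """Build a record entry in the same shape as _make_entry."""
    recorded_at = EPISODE_DATE - timedelta(days=days_ago)
    return {"value": str(value), "recorded_at": recorded_at.isoformat()}


record = {"hba1c": entry(7.0, 120), "gfr": entry(55.0, 60), "creatinine": entry(1.4, 90)}
print(stale_fields(record))

history = [
    (EPISODE_DATE - timedelta(days=120), 7.2),
    (EPISODE_DATE - timedelta(days=3), 10.0),
    (EPISODE_DATE, 7.0),
]
print(flag_anomalies(history))
```

Note that an entry recorded exactly on the cutoff date counts as fresh, because the environment compares with a strict `<`.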

test_agent.py ADDED

@@ -0,0 +1,227 @@

+"""
+PRANA-Env agent with full minimal RL loop.
+The LLM agent must:
+  1. query_db      — retrieve required fields from PatientDB
+  2. record_value  — write each field into the episode record
+  3. file_report   — submit to KARS validator → reward → done
+Reward signal:
+  +15  KARS PASSED first attempt
+  +10  KARS PASSED after correction
+   -1  redundant query (field already fresh in the record)
+   -5  filed with missing or stale fields (recoverable)
+  -10  unrecoverable failure
+"""
+import json
+import openai
+from dataclasses import dataclass, field
+from typing import Optional
+from prana_env.client import PranaEnv
+from prana_env.models import PranaAction
+# ── Tool definitions ──────────────────────────────────────────────────────────
+TOOLS = [
+    {
+        "type": "function",
+        "function": {
+            "name": "query_db",
+            "description": "Retrieve a specific field from a clinical datastore for a patient.",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "target":     {"type": "string", "description": "PatientDB | ClinicalNotesDB | PharmacyDB | WaitlistDB"},
+                    "field":      {"type": "string", "description": "Field name (e.g. hba1c, gfr, creatinine, blood_type)"},
+                    "patient_id": {"type": "string", "description": "Patient identifier (e.g. P001)"},
+                },
+                "required": ["target", "field", "patient_id"],
+            },
+        },
+    },
+    {
+        "type": "function",
+        "function": {
+            "name": "record_value",
+            "description": "Write a retrieved field value into the episode patient record.",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "field":  {"type": "string", "description": "Field name to record"},
+                    "value":  {"type": "string", "description": "Value to record"},
+                    "source": {"type": "string", "description": "Datastore the value came from"},
+                },
+                "required": ["field", "value"],
+            },
+        },
+    },
+    {
+        "type": "function",
+        "function": {
+            "name": "file_report",
+            "description": (
+                "Submit the compiled patient record to the KARS validator. "
+                "Returns PASSED (done) or FAILED with the missing or stale fields. "
+                "Call only after recording all required fields: hba1c, gfr, creatinine, blood_type."
+            ),
+            "parameters": {"type": "object", "properties": {}, "required": []},
+        },
+    },
+]
+SYSTEM_PROMPT = """You are a kidney transplant administrative agent.
+Your goal is to compile a complete patient record and file a KARS-compliant SRTR report.
+Required fields: hba1c, gfr, creatinine, blood_type (all from PatientDB).
+KARS Recency Policy:
+- Time-sensitive fields (hba1c, gfr, creatinine) must be recorded within 90 days of the filing date.
+- Stable fields (blood_type) have no recency requirement.
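+- The episode starts with a pre-existing T1 record whose age varies per episode (60–150 days), so its values may be STALE or still FRESH.
+- Re-query and re-record only the time-sensitive fields whose recorded date is older than 90 days.
+- Do NOT re-query fields that are already fresh, or blood_type (stable and already valid); redundant queries are penalized.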
+Workflow:
+1. Check recorded_fields in the observation — identify stale time-sensitive fields.
+2. Use query_db to retrieve fresh values for stale fields only.
+3. Use record_value to write each fresh value into the patient record.
+4. Use file_report to submit. If it fails due to stale or missing fields, fix and retry.
+Do not guess values. Always query before recording."""
+# ── Trajectory dataclass ──────────────────────────────────────────────────────
+@dataclass
+class Step:
+    action: dict
+    observation: str
+    reward: float
+    done: bool
+@dataclass
+class Trajectory:
+    episode_id: str
+    steps: list[Step] = field(default_factory=list)
+    @property
+    def total_reward(self) -> float:
+        return sum(s.reward for s in self.steps)
+    def __repr__(self):
+        terminal = next((s for s in reversed(self.steps) if s.done), None)
+        kars = terminal.observation if terminal else "incomplete"
+        return (
+            f"Trajectory(episode={self.episode_id}, "
+            f"steps={len(self.steps)}, "
+            f"total_reward={self.total_reward}, "
+            f"outcome={kars!r})"
+        )
+# ── RL primitives ─────────────────────────────────────────────────────────────
+def reset(env: PranaEnv, patient_id: str) -> str:
+    result = env.reset(patient_id=patient_id)
+    return result.observation.query_result
+def step(env: PranaEnv, action_type: str, **kwargs) -> tuple[str, float, bool, list]:
+    result = env.step(PranaAction(action_type=action_type, **kwargs))
+    obs    = result.observation
+    return (
+        obs.query_result,
+        obs.reward or 0.0,
+        obs.done or False,
+        obs.missing_fields or [],
+    )
+def rollout(env: PranaEnv, task: str, patient_id: str, episode_id: str, max_turns: int = 20) -> Trajectory:
+    """Run one full episode. LLM drives the action loop until done=True."""
+    llm = openai.OpenAI()
+    messages = [
+        {"role": "system", "content": SYSTEM_PROMPT},
+        {"role": "user",   "content": task},
+    ]
+    trajectory = Trajectory(episode_id=episode_id)
+    print(f"\n── Episode {episode_id} ──────────────────────────────")
+    print(f"Task: {task}")
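+    initial_obs = reset(env, patient_id)
+    print(f"[reset] {initial_obs}")
+    # Pass the reset observation to the LLM so it can see the T1 record date and staleness.
+    messages.append({"role": "user", "content": f"Initial observation: {initial_obs}"})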
+    for turn in range(max_turns):
+        response = llm.chat.completions.create(
+            model="gpt-4o",
+            tools=TOOLS,
+            messages=messages,
+        )
+        msg = response.choices[0].message
+        messages.append(msg)
+        # No tool calls → LLM finished without filing (shouldn't happen with good prompt)
+        if not msg.tool_calls:
+            print(f"[turn {turn+1}] Agent: {msg.content}")
+            trajectory.steps.append(Step(
+                action={"type": "end_turn"},
+                observation=msg.content or "",
+                reward=0.0,
+                done=True,
+            ))
+            break
+        for tool_call in msg.tool_calls:
+            action_type = tool_call.function.name
+            inp = json.loads(tool_call.function.arguments)
+            print(f"[turn {turn+1}] {action_type}({json.dumps(inp)})")
+            obs_str, reward, done, missing = step(env, action_type, **inp)
+            print(f"[turn {turn+1}] obs={obs_str!r}  reward={reward}  done={done}")
+            trajectory.steps.append(Step(
+                action={"type": action_type, **inp},
+                observation=obs_str,
+                reward=reward,
+                done=done,
+            ))
+            messages.append({
+                "role": "tool",
+                "tool_call_id": tool_call.id,
+                "content": obs_str,
+            })
+            if done:
+                return trajectory
+    return trajectory
+def run_episodes(task: str, patient_id: str, n: int = 1) -> list[Trajectory]:
+    """Run N independent episodes. Set n=8 for GRPO rollout batch."""
+    trajectories = []
+    with PranaEnv(base_url="http://localhost:8000") as env:
+        for i in range(n):
+            traj = rollout(env, task, patient_id, episode_id=f"ep_{i+1}")
+            trajectories.append(traj)
+    print(f"\n── Summary ({n} episode(s)) ──────────────────────────")
+    for t in trajectories:
+        print(f"  {t}")
+    return trajectories
+# ── Entry point ───────────────────────────────────────────────────────────────
+if __name__ == "__main__":
+    run_episodes(
+        task=(
+            "File a KARS-compliant SRTR report for patient P001. "
+            "A T1 record already exists; its age varies by episode. "
+            "Check which fields are stale, re-query only what's needed, and file."
+        ),
+        patient_id="P001",
+        n=1,  # set n=8 for GRPO rollout batch
+    )
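The `n=8` comment above refers to a GRPO rollout batch. As a rough sketch of what could be done with the returned trajectories, the block below normalizes per-episode returns within one group, which is the usual group-relative advantage. The function name `group_relative_advantages`, the `eps` term, and the use of the population standard deviation are assumptions made for illustration; the source does not show any training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(returns: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize episode returns within one rollout group.

    Each advantage is (return - group mean) / (group std + eps).
    The population standard deviation is an assumption of this sketch.
    """
    mu = mean(returns)
    sigma = pstdev(returns)
    return [(r - mu) / (sigma + eps) for r in returns]


# Example returns built from the reward schedule in the module docstring:
# first-attempt pass, pass after correction, unrecoverable failure,
# and a first-attempt pass with one redundant query.
example_returns = [15.0, 10.0, -10.0, 14.0]
advantages = group_relative_advantages(example_returns)
for ret, adv in zip(example_returns, advantages):
    print(f"return={ret:+.1f}  advantage={adv:+.3f}")
```

With the trajectories from `run_episodes`, the input would be `[t.total_reward for t in trajectories]`. When every episode in the group earns the same return, all advantages are zero and the group carries no learning signal.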

test_client.py CHANGED

@@ -1,13 +1,16 @@
 from prana_env.client import PranaEnv
 from prana_env.models import PranaAction
-with PranaEnv(base_url="http://localhost:8000") as client:
-	client.reset()
-	result = client.step(PranaAction(action_type="query_db",
-								    target="PatientDB",
-                                    field="hba1c",
-                                    patient_id="P001",
-    ))
-	print(result.observation.query_result)

+import asyncio
 from prana_env.client import PranaEnv
 from prana_env.models import PranaAction
+async def main():
+    async with PranaEnv(base_url="http://localhost:8000") as client:
+        await client.reset()
+        result = await client.step(PranaAction(
+            action_type="query_db",
+            target="PatientDB",
+            field="hba1c",
+            patient_id="P001",
+        ))
+        print(result.observation.query_result)
+asyncio.run(main())