# PayOps Environment — Test Cases & Testing Guide

This document covers every testable behaviour of the PayOps OpenEnv, organised by endpoint and scenario. Each test shows the exact command to run, the expected response, and what a failure looks like.

---

## Prerequisites

```bash
# 1. Start the server (run from /Users/padmapriya)
PYTHONPATH=/Users/padmapriya uvicorn payops_env.server.app:app --host 0.0.0.0 --port 8000

# 2. Confirm it is up (should return {"status":"ok",...})
curl -s http://localhost:8000/health
```

All `curl` commands below assume the server is running on `localhost:8000`.

---

## T-01 Health Check

**Goal:** Confirm the server is alive and returns version metadata.

```bash
curl -s http://localhost:8000/health
```

**Expected output**

```json
{"status": "ok", "environment": "payops_env", "version": "2.0.0"}
```

**Failure indicator:** Connection refused, or any field missing / wrong value.

---

## T-02 Schema Endpoint

**Goal:** Verify that action, observation, and state JSON schemas are served correctly.

```bash
curl -s http://localhost:8000/schema | python3 -m json.tool
```

**Expected output (condensed)**

```json
{
  "action": { "title": "PayOpsAction", "type": "object", ... },
  "observation": { "title": "PayOpsObservation", "type": "object", ... },
  "state": { "title": "PayOpsState", "type": "object", ... }
}
```

**Checks to verify manually:**

- `action.properties` includes `action_type`, `transaction_id`, `reason`, `confidence`
- `observation.properties` includes `risk_score`, `flags`, `kyc_status`, `velocity_1h`
- HTTP status code is `200`

---

## T-03 Tasks Endpoint

**Goal:** Confirm all 20 tasks are returned with the correct difficulty distribution.

```bash
curl -s http://localhost:8000/tasks | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('Total tasks:', d['count'])
from collections import Counter
c = Counter(t['difficulty'] for t in d['tasks'])
print('By difficulty:', dict(c))
print()
for t in d['tasks']:
    print(f\" {t['task_id']:12} [{t['difficulty']:8}] correct={t['correct_action']}\")
"
```

**Expected output**

```
Total tasks: 20
By difficulty: {'easy': 4, 'medium': 6, 'hard': 6, 'critical': 4}

 EASY-001     [easy    ] correct=approve
 EASY-002     [easy    ] correct=reject
 EASY-003     [easy    ] correct=approve
 EASY-004     [easy    ] correct=flag
 MED-001      [medium  ] correct=escalate
 MED-002      [medium  ] correct=hold
 MED-003      [medium  ] correct=flag
 MED-004      [medium  ] correct=flag
 MED-005      [medium  ] correct=hold
 MED-006      [medium  ] correct=escalate
 HARD-001     [hard    ] correct=escalate
 HARD-002     [hard    ] correct=reject
 HARD-003     [hard    ] correct=reject
 HARD-004     [hard    ] correct=approve
 HARD-005     [hard    ] correct=escalate
 HARD-006     [hard    ] correct=flag
 CRIT-001     [critical] correct=approve
 CRIT-002     [critical] correct=reject
 CRIT-003     [critical] correct=escalate
 CRIT-004     [critical] correct=reject
```

> Note: correct_action values for jitter-variant tasks (EASY-004, MED-001/003/004/006,
> HARD-001/006, CRIT-001/003/004) may differ per episode seed — the above shows default values.

**Failure indicator:** count != 20, missing difficulty tier, wrong correct_action.

---

## T-04 Reset

**Goal:** Reset the environment and confirm the first task is EASY-001.

```bash
curl -s -X POST http://localhost:8000/reset | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('task_id :', d['task_id'])
print('transaction_id :', d['transaction_id'])
print('difficulty :', d['task_difficulty'])
print('status :', d['status'])
print('done :', d['done'])
print('reward :', d['reward'])
print('cumulative_reward :', d['cumulative_reward'])
print('risk_score :', d['risk_score'])
"
```

**Expected output**

```
task_id : EASY-001
transaction_id : TXN-E001
difficulty : easy
status : pending
done : False
reward : 0.0
cumulative_reward : 0.0
risk_score : 0.05
```

**Failure indicator:** `done=true`, `reward != 0`, wrong `task_id`.

---

## T-05 Correct Action — Full Credit (+1.0)

**Goal:** Submit the correct action for EASY-001 (`approve`) and receive reward +1.0.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null   # fresh start

curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E001"}' \
  | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('reward :', d['reward'])
print('correct info :', d['info'].get('correct_action'))
print('action taken :', d['info'].get('action_taken'))
"
```

**Expected output**

```
reward : 1.0
correct info : approve
action taken : approve
```

---

## T-06 Wrong Action — Penalty (approve on fraud = -1.0)

**Goal:** Skip to EASY-002 (textbook fraud) and approve it. Expect -1.0 penalty.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

# Step past EASY-001
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E001"}' > /dev/null

# Now on EASY-002 (correct=reject). Try approving it.
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E002"}' \
  | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('reward :', d['reward'])
print('correct was :', d['info'].get('correct_action'))
"
```

**Expected output**

```
reward : -1.0
correct was : reject
```

---

## T-07 Partial Credit Action

**Goal:** On MED-001 (correct=`escalate`), submit `flag` — should earn +0.5 partial credit.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

# Step through EASY tasks 1-4 with any actions
for ACTION in approve reject approve reject; do
  curl -s -X POST http://localhost:8000/step \
    -H "Content-Type: application/json" \
    -d "{\"action_type\":\"$ACTION\",\"transaction_id\":\"dummy\"}" > /dev/null
done

# Now on MED-001 (correct=escalate). Submit flag.
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"flag","transaction_id":"TXN-M001"}' \
  | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('reward :', d['reward'])
print('correct was :', d['info'].get('correct_action'))
print('partial? :', 0 < d['reward'] < 1.0)
"
```

**Expected output**

```
reward : 0.5
correct was : escalate
partial? : True
```

---

## T-08 Inspect Action — Information Reveal

**Goal:** Use `inspect` on EASY-001 to receive investigation notes and a small reward (+0.15). The episode should NOT advance (still on same transaction).

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"inspect","transaction_id":"TXN-E001"}' \
  | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('reward :', d['reward'])
print('status :', d['status'])
print('task_id :', d['task_id'])  # should still be EASY-001
print('inspection_notes :', d['inspection_notes'])
"
```

**Expected output**

```
reward : 0.15
status : inspected
task_id : EASY-001
inspection_notes : Sender account opened 3 years ago. Consistent transaction history. KYC fully verified.
```

---

## T-09 Double Inspect — No Double-Dipping

**Goal:** Inspect the same transaction twice. Second inspect should return reward 0.0 (already inspected).

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

# First inspect — reward 0.15
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"inspect","transaction_id":"TXN-E001"}' \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print('First inspect reward:', d['reward'])"

# Second inspect — reward 0.0
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"inspect","transaction_id":"TXN-E001"}' \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print('Second inspect reward:', d['reward'])"
```

**Expected output**

```
First inspect reward: 0.15
Second inspect reward: 0.0
```

---

## T-10 Invalid Action Type

**Goal:** Send an unsupported action type and receive a 422 validation error.

```bash
# Prints the first 60 characters of the error detail returned by the server
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"delete","transaction_id":"TXN-E001"}' \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print('detail:', d.get('detail','')[:60])"
```

**Expected output**

```
detail: Invalid action_type 'delete'. Valid values: ['approve', 'escal
```

HTTP status code should be `422`.

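The reward and validation behaviour exercised in T-05 through T-10 can be mirrored offline to predict what a step should return before calling the server. The sketch below is not the environment's grader: the function name `expected_reward` and the `task_overrides` parameter are hypothetical, and the rules only restate the values listed in this guide's Quick Reference reward table. Task-specific values (partial credit, or any per-task figure a test in this guide reports) are passed in explicitly through `task_overrides` rather than guessed.

```python
# Hypothetical offline model of the reward rules documented in this guide.
# It is NOT the server's grader; names and structure are illustrative only.
VALID_ACTIONS = {"approve", "reject", "flag", "escalate", "inspect", "hold"}


def expected_reward(action, correct_action, already_inspected=False, task_overrides=None):
    """Return the reward this guide documents for `action` on a task."""
    if action not in VALID_ACTIONS:
        # The server is expected to answer HTTP 422 in this case (see T-10).
        raise ValueError(f"Invalid action_type {action!r}")
    if action == "inspect":
        # A repeat inspect on the same transaction earns nothing (see T-09).
        return 0.0 if already_inspected else 0.15
    if action == correct_action:
        return 1.0
    # Task-specific values are supplied by the caller, never inferred.
    if task_overrides and action in task_overrides:
        return task_overrides[action]
    if action == "approve":
        # Approving fraud or an escalation case costs more than approving
        # a transaction that should only have been flagged or held.
        return -1.0 if correct_action in {"reject", "escalate"} else -0.5
    if action == "reject" and correct_action == "approve":
        return -0.5
    return -0.25


if __name__ == "__main__":
    print(expected_reward("approve", "approve"))                              # T-05
    print(expected_reward("approve", "reject"))                               # T-06
    print(expected_reward("flag", "escalate", task_overrides={"flag": 0.5}))  # T-07
    print(expected_reward("inspect", "approve"))                              # T-08
    print(expected_reward("inspect", "approve", already_inspected=True))      # T-09
```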

---

## T-11 Step Without Reset

**Goal:** Call `/step` without calling `/reset` first. Should return a `400` error.

```bash
# Kill and restart server to guarantee clean state
# Then immediately step without reset:
curl -s -o /dev/null -w "%{http_code}" -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E001"}'
```

**Expected output**

```
400
```

---

## T-12 State Endpoint Tracking

**Goal:** Confirm `/state` reflects the episode progress correctly.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

curl -s http://localhost:8000/state | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('step_count :', d['step_count'])
print('transactions_processed:', d['transactions_processed'])
print('total_tasks :', d['total_tasks'])
print('done :', d['done'])
"

# Take one step
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E001"}' > /dev/null

curl -s http://localhost:8000/state | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('step_count :', d['step_count'])
print('transactions_processed:', d['transactions_processed'])
print('last_action :', d['last_action'])
print('cumulative_reward :', d['cumulative_reward'])
"
```

**Expected output (before step)**

```
step_count : 0
transactions_processed: 0
total_tasks : 12
done : False
```

**Expected output (after step)**

```
step_count : 1
transactions_processed: 1
last_action : approve
cumulative_reward : 1.0
```

---

## T-13 Complete Episode — Done Flag

**Goal:** Step through all 12 tasks and confirm `done=true` on the last step.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
ACTIONS = [
    "approve", "reject", "approve", "flag",     # easy
    "escalate", "hold", "flag", "flag",         # medium
    "escalate", "reject", "reject", "approve",  # hard (perfect sequence)
]
client = httpx.Client()
txn_ids = [t["transaction_id"] for t in client.get(f"{BASE}/tasks").json()["tasks"]]
for i, (action, txn) in enumerate(zip(ACTIONS, txn_ids)):
    resp = client.post(f"{BASE}/step", json={"action_type": action, "transaction_id": txn}).json()
    print(f"Step {i+1:2d} {txn:12} action={action:10} reward={resp['reward']:+.2f} done={resp['done']}")
client.close()
EOF
```

**Expected output (last line)**

```
Step 12 TXN-H004     action=approve    reward=+1.00 done=True
```

All other steps should show `done=False`.

---

## T-14 Grader Endpoint

**Goal:** Grade and score the episode immediately after completing all steps.

```bash
# Run the perfect sequence first (T-13 above), then:
curl -s http://localhost:8000/grader | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('total_reward :', d['total_reward'])
print('max_possible :', d['max_possible_reward'])
print('normalised_score :', d['normalised_score'])
print('passed :', d['passed'])
print()
for t in d['per_task']:
    mark = '✓' if t['correct'] else '✗'
    print(f\" {mark} {t['task_id']:12} action={t['action_taken']:10} correct={t['correct_action']:10} reward={t['reward']:+.2f}\")
"
```

**Expected output (perfect run)**

```
total_reward : 12.0
max_possible : 12.0
normalised_score : 1.0
passed : True

 ✓ EASY-001     action=approve    correct=approve    reward=+1.00
 ✓ EASY-002     action=reject     correct=reject     reward=+1.00
 ...
 ✓ HARD-004     action=approve    correct=approve    reward=+1.00
```

---

## T-15 Grader Without Episode

**Goal:** Call `/grader` before any steps — should return a 400 error.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null
curl -s http://localhost:8000/grader
```

**Expected output**

```json
{"error": "No actions recorded. Run /reset then /step first."}
```

---

## T-16 Baseline Endpoint

**Goal:** Confirm `/baseline` runs the rule-based agent and returns a normalised score ≥ 0.5.

```bash
curl -s -X POST http://localhost:8000/baseline | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('normalised_score :', d['normalised_score'])
print('total_reward :', d['total_reward'])
print('steps :', d['steps'])
print('passed (>=0.5) :', d['normalised_score'] >= 0.5)
print()
for t in d['scores']:
    mark = '✓' if t['correct'] else '✗'
    print(f\" {mark} {t['task_id']:12} [{t['difficulty']:6}] action={t['action_taken']:10} reward={t['reward']:+.2f}\")
"
```

**Expected output**

```
normalised_score : 0.7292
total_reward : 8.75
steps : 12
passed (>=0.5) : True

 ✓ EASY-001     [easy  ] action=approve    reward=+1.00
 ✓ EASY-002     [easy  ] action=reject     reward=+1.00
 ...
```

---

## T-17 Optional Reason and Confidence Fields

**Goal:** Submit an action with optional `reason` and `confidence`. Should be accepted without error.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{
    "action_type": "flag",
    "transaction_id": "TXN-E001",
    "reason": "Low risk but flagging for audit trail verification",
    "confidence": 0.82
  }' \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print('HTTP OK, reward:', d['reward'])"
```

**Expected output**

```
HTTP OK, reward: 0.3
```

(0.3 = partial credit for flagging when correct action is approve)

---

## T-18 Hard Task — Model Poisoning (HARD-001)

**Goal:** Demonstrate that the low risk_score (0.18) is a trap. The correct action is `escalate`.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

# Advance to HARD-001 (task 9) — step through easy + medium with dummy actions
python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
c = httpx.Client()
tasks = c.get(f"{BASE}/tasks").json()["tasks"]

# Steps 1-8: easy + medium
for t in tasks[:8]:
    c.post(f"{BASE}/step", json={"action_type": "approve", "transaction_id": t["transaction_id"]})

# Step 9: HARD-001 — agent trusts the low risk_score and approves (wrong!)
resp_wrong = c.post(f"{BASE}/step", json={
    "action_type": "approve",
    "transaction_id": "TXN-H001"
}).json()
print("Trusted ML score → approve")
print(" reward :", resp_wrong["reward"])  # expect -0.5
print(" correct :", resp_wrong["info"]["correct_action"])

# Reset and do it correctly
c.post(f"{BASE}/reset")
for t in tasks[:8]:
    c.post(f"{BASE}/step", json={"action_type": "approve", "transaction_id": t["transaction_id"]})
resp_correct = c.post(f"{BASE}/step", json={
    "action_type": "escalate",
    "transaction_id": "TXN-H001"
}).json()
print("\nOverrode ML score → escalate")
print(" reward :", resp_correct["reward"])  # expect +1.0
c.close()
EOF
```

**Expected output**

```
Trusted ML score → approve
 reward : -0.5
 correct : escalate

Overrode ML score → escalate
 reward : 1.0
```

---

## T-19 Inspect Reveals Hidden Context (HARD-001)

**Goal:** Inspect HARD-001 to reveal the mule-account intelligence note before deciding.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

# Advance to HARD-001
python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
c = httpx.Client()
tasks = c.get(f"{BASE}/tasks").json()["tasks"]
for t in tasks[:8]:
    c.post(f"{BASE}/step", json={"action_type": "approve", "transaction_id": t["transaction_id"]})

# Inspect HARD-001
resp = c.post(f"{BASE}/step", json={
    "action_type": "inspect",
    "transaction_id": "TXN-H001"
}).json()
print("Inspect reward :", resp["reward"])
print("Notes :", resp["inspection_notes"])
c.close()
EOF
```

**Expected output**

```
Inspect reward : 0.15
Notes : Account created 7 days ago. This is the first outbound transfer. Receiver matches a pattern of solicitor-impersonation mule accounts flagged in last month's intelligence bulletin. Risk model underscored due to clean transaction history (new account).
```

---

## T-20 WebSocket Session

**Goal:** Run a full reset → step sequence over the WebSocket endpoint.

```bash
pip install websockets -q   # if not already installed

python3 - <<'EOF'
import asyncio, json, websockets

async def test_ws():
    uri = "ws://localhost:8000/ws"
    async with websockets.connect(uri) as ws:
        # Reset
        await ws.send(json.dumps({"type": "reset"}))
        obs = json.loads(await ws.recv())
        print("Reset →", obs["transaction_id"], "risk:", obs["risk_score"])

        # Step – approve
        await ws.send(json.dumps({
            "type": "step",
            "action_type": "approve",
            "transaction_id": obs["transaction_id"]
        }))
        obs2 = json.loads(await ws.recv())
        print("Step →", "reward:", obs2["reward"], "next:", obs2["transaction_id"])

        # State
        await ws.send(json.dumps({"type": "state"}))
        state = json.loads(await ws.recv())
        print("State →", "steps:", state["step_count"], "txns:", state["transactions_processed"])

asyncio.run(test_ws())
EOF
```

**Expected output**

```
Reset → TXN-E001 risk: 0.05
Step → reward: 1.0 next: TXN-E002
State → steps: 1 txns: 1
```

---

## T-21 Baseline Agent Script (Standalone)

**Goal:** Run the standalone Python baseline script independently of the server.

```bash
cd /Users/padmapriya
PYTHONPATH=/Users/padmapriya python3 payops_env/scripts/baseline_agent.py
```

**Expected output (last few lines)**

```
============================================================
Episode Summary
============================================================
Steps : 12
Total reward : +8.75
Max possible : 12.00
Normalised score : 0.7292
Passed (≥0.5) : YES ✓
============================================================
```

---

## T-22 All Actions Are Valid on Each Task

**Goal:** Confirm every action type is accepted without error (even if penalised).

```bash
python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
ACTIONS = ["approve", "reject", "flag", "escalate", "inspect", "hold"]
c = httpx.Client()
for action in ACTIONS:
    c.post(f"{BASE}/reset")
    resp = c.post(f"{BASE}/step", json={
        "action_type": action,
        "transaction_id": "TXN-E001"
    })
    print(f"action={action:10} HTTP={resp.status_code} reward={resp.json()['reward']:+.2f}")
c.close()
EOF
```

**Expected output**

```
action=approve    HTTP=200 reward=+1.00
action=reject     HTTP=200 reward=-0.50
action=flag       HTTP=200 reward=+0.30
action=escalate   HTTP=200 reward=-0.25
action=inspect    HTTP=200 reward=+0.15
action=hold       HTTP=200 reward=-0.25
```

---

## T-23 Full Perfect Episode (Score = 1.0)

**Goal:** Submit all 12 correct actions and confirm normalised_score = 1.0.

```bash
python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
c = httpx.Client()
tasks = c.get(f"{BASE}/tasks").json()["tasks"]
c.post(f"{BASE}/reset")
for t in tasks:
    resp = c.post(f"{BASE}/step", json={
        "action_type": t["correct_action"],
        "transaction_id": t["transaction_id"]
    }).json()
    mark = "✓" if resp["reward"] == 1.0 else "✗"
    print(f"{mark} {t['task_id']:12} action={t['correct_action']:10} reward={resp['reward']:+.2f}")

score = c.get(f"{BASE}/grader").json()
print()
print("Normalised score:", score["normalised_score"])
print("Passed :", score["passed"])
c.close()
EOF
```

**Expected output**

```
✓ EASY-001     action=approve    reward=+1.00
✓ EASY-002     action=reject     reward=+1.00
✓ EASY-003     action=approve    reward=+1.00
✓ EASY-004     action=flag       reward=+1.00
✓ MED-001      action=escalate   reward=+1.00
✓ MED-002      action=hold       reward=+1.00
✓ MED-003      action=flag       reward=+1.00
✓ MED-004      action=flag       reward=+1.00
✓ HARD-001     action=escalate   reward=+1.00
✓ HARD-002     action=reject     reward=+1.00
✓ HARD-003     action=reject     reward=+1.00
✓ HARD-004     action=approve    reward=+1.00

Normalised score: 1.0
Passed : True
```

---

## T-24 Worst-Case Episode (Approve Everything)

**Goal:** Approve all 12 transactions (maximally wrong) and confirm very low score.

```bash
python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
c = httpx.Client()
tasks = c.get(f"{BASE}/tasks").json()["tasks"]
c.post(f"{BASE}/reset")
total = 0
for t in tasks:
    resp = c.post(f"{BASE}/step", json={
        "action_type": "approve",
        "transaction_id": t["transaction_id"]
    }).json()
    total += resp["reward"]
    print(f"{t['task_id']:12} correct={t['correct_action']:10} reward={resp['reward']:+.2f}")

score = c.get(f"{BASE}/grader").json()
print()
print(f"Total reward : {total:+.2f}")
print(f"Normalised score : {score['normalised_score']}")
print(f"Passed : {score['passed']}")
c.close()
EOF
```

**Expected outcome:** Several `-1.0` and `-0.5` penalties. Normalised score near or equal to `0.0`. `passed=False`.

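The grader figures quoted in this guide are consistent with dividing the total reward by the maximum possible reward: 8.75 / 12 ≈ 0.7292 for the baseline and 12 / 12 = 1.0 for a perfect run. The sketch below reproduces that arithmetic so the numbers can be sanity-checked by hand. The function names, the clamp to the range 0 to 1 (suggested by the "near or equal to `0.0`" outcome above), and the four-decimal rounding are assumptions for illustration, not the server's confirmed implementation; the 0.5 pass threshold is the one this guide uses for the baseline.

```python
def normalised_score(total_reward: float, max_possible: float) -> float:
    """Scale a total reward into [0, 1]; clamping and rounding are assumed."""
    if max_possible <= 0:
        raise ValueError("max_possible must be positive")
    raw = total_reward / max_possible
    # Assumed: negative totals floor at 0.0 and the score never exceeds 1.0.
    return round(min(1.0, max(0.0, raw)), 4)


def passed(score: float, threshold: float = 0.5) -> bool:
    """Apply the pass threshold used for the baseline in this guide."""
    return score >= threshold


if __name__ == "__main__":
    baseline = normalised_score(8.75, 12.0)
    print(baseline, passed(baseline))   # baseline figures from T-16
    perfect = normalised_score(12.0, 12.0)
    print(perfect, passed(perfect))     # perfect run from T-23
```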

---

## Quick Reference — Expected Rewards Per Action

| Scenario | Action | Reward |
|----------|--------|--------|
| Correct decision | any | `+1.0` |
| Inspect (first time) | `inspect` | `+0.15` |
| Inspect (already inspected) | `inspect` | `0.0` |
| Partial credit (task-specific) | adjacent | `+0.2` – `+0.6` |
| Approve fraud/escalation | `approve` | `-1.0` |
| Approve flagged/held | `approve` | `-0.5` |
| Reject legitimate tx | `reject` | `-0.5` |
| Any other wrong action | any | `-0.25` |

---

## Quick Reference — Correct Actions per Task

| Task ID | Difficulty | Correct Action | Key Signal |
|---------|-----------|----------------|------------|
| EASY-001 | easy | `approve` | risk=0.05, no flags, verified KYC |
| EASY-002 | easy | `reject` | sanctioned country, unknown sender, risk=0.97 |
| EASY-003 | easy | `approve` | small refund, risk=0.03, verified |
| EASY-004 | easy | `flag` | velocity_1h=15 (ATM burst) |
| MED-001 | medium | `escalate` | large B2B, cross-border, medium-risk country |
| MED-002 | medium | `hold` | KYC pending, large internal transfer |
| MED-003 | medium | `flag` | amount 3× historical average |
| MED-004 | medium | `flag` | crypto exchange, moderate risk |
| HARD-001 | hard | `escalate` | risk_score=0.18 is poisoned — manual flags say escalate |
| HARD-002 | hard | `reject` | APP scam, mule account pattern |
| HARD-003 | hard | `reject` | structuring/smurfing, KYC failed |
| HARD-004 | hard | `approve` | legitimate FX settlement — looks scary, is fine |

---

## Running All Tests in One Script

Save the following as `payops_env/run_tests.sh` and execute it from `/Users/padmapriya`:

```bash
#!/usr/bin/env bash
# run_tests.sh — smoke-test all PayOps endpoints
set -e

BASE="http://localhost:8000"
PASS=0
FAIL=0

check() {
  local name="$1"
  local got="$2"
  local want="$3"
  # -F treats the expected text as a literal string, not a regex
  if echo "$got" | grep -qF "$want"; then
    echo " ✓ $name"
    # ((PASS++)) returns non-zero when PASS is 0 and would abort under `set -e`
    PASS=$((PASS + 1))
  else
    echo " ✗ $name (expected '$want', got '$got')"
    FAIL=$((FAIL + 1))
  fi
}

echo "=== PayOps Test Suite ==="

check "T-01 health" "$(curl -s $BASE/health)" '"status":"ok"'
check "T-02 schema" "$(curl -s $BASE/schema)" '"PayOpsAction"'
check "T-03 tasks count" "$(curl -s $BASE/tasks)" '"count":12'
check "T-04 reset" "$(curl -s -X POST $BASE/reset)" '"task_id":"EASY-001"'
check "T-05 correct step" "$(curl -s -X POST $BASE/step -H 'Content-Type: application/json' -d '{"action_type":"approve","transaction_id":"TXN-E001"}')" '"reward":1.0'
check "T-10 invalid action" "$(curl -s -X POST $BASE/step -H 'Content-Type: application/json' -d '{"action_type":"delete","transaction_id":"TXN-E001"}')" "Invalid action_type"
check "T-16 baseline" "$(curl -s -X POST $BASE/baseline)" '"normalised_score"'

echo ""
echo "Results: $PASS passed, $FAIL failed"
```

```bash
cd /Users/padmapriya
bash payops_env/run_tests.sh
```

**Expected output**

```
=== PayOps Test Suite ===
 ✓ T-01 health
 ✓ T-02 schema
 ✓ T-03 tasks count
 ✓ T-04 reset
 ✓ T-05 correct step
 ✓ T-10 invalid action
 ✓ T-16 baseline

Results: 7 passed, 0 failed
```

---

## Interactive API Explorer

FastAPI serves auto-generated interactive docs. Open in a browser while the server is running:

```
http://localhost:8000/docs   ← Swagger UI (try endpoints in-browser)
http://localhost:8000/redoc  ← ReDoc documentation
```
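
The smoke-test script above matches raw substrings, so a check such as `'"status":"ok"'` only succeeds when the response is compact JSON with no space after the colon. As a sketch of a less brittle alternative, the helper below parses the body and compares a field by value. The name `json_field_equals` and the dotted-path convention are hypothetical and not part of the PayOps codebase.

```python
import json


def json_field_equals(body: str, path: str, want) -> bool:
    """Parse `body` as JSON and compare the value at dotted `path` with `want`."""
    try:
        value = json.loads(body)
    except json.JSONDecodeError:
        return False
    # Walk nested objects one key at a time, e.g. "info.correct_action".
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return False
        value = value[key]
    return value == want


if __name__ == "__main__":
    compact = '{"status":"ok","version":"2.0.0"}'
    spaced = '{"status": "ok", "version": "2.0.0"}'
    nested = '{"reward": -1.0, "info": {"correct_action": "reject"}}'
    print(json_field_equals(compact, "status", "ok"))
    print(json_field_equals(spaced, "status", "ok"))
    print(json_field_equals(nested, "info.correct_action", "reject"))
```

Because the comparison happens on parsed values, the same check passes whether the server pretty-prints its JSON or not.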