# PayOps Environment — Test Cases & Testing Guide

This document covers every testable behaviour of the PayOps OpenEnv, organised
by endpoint and scenario. Each test shows the exact command to run, the
expected response, and what a failure looks like.

---

## Prerequisites

```bash
# 1. Start the server (run from /Users/padmapriya)
PYTHONPATH=/Users/padmapriya uvicorn payops_env.server.app:app --host 0.0.0.0 --port 8000

# 2. Confirm it is up (should return {"status":"ok",...})
curl -s http://localhost:8000/health
```

All `curl` commands below assume the server is running on `localhost:8000`.

---

## T-01  Health Check

**Goal:** Confirm the server is alive and returns version metadata.

```bash
curl -s http://localhost:8000/health
```

**Expected output**
```json
{"status": "ok", "environment": "payops_env", "version": "2.0.0"}
```

**Failure indicator:** Connection refused, or any field missing / wrong value.

---

## T-02  Schema Endpoint

**Goal:** Verify that action, observation, and state JSON schemas are served correctly.

```bash
curl -s http://localhost:8000/schema | python3 -m json.tool
```

**Expected output (condensed)**
```json
{
  "action":      { "title": "PayOpsAction", "type": "object", ... },
  "observation": { "title": "PayOpsObservation", "type": "object", ... },
  "state":       { "title": "PayOpsState", "type": "object", ... }
}
```

**Checks to verify manually:**
- `action.properties` includes `action_type`, `transaction_id`, `reason`, `confidence`
- `observation.properties` includes `risk_score`, `flags`, `kyc_status`, `velocity_1h`
- HTTP status code is `200`
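
The property checks above can also be scripted. The following is a minimal sketch, assuming the `/schema` response has the shape shown in the condensed output; `missing_schema_fields` is a hypothetical helper name, not part of PayOps.

```python
# Field names taken from the manual checklist above.
REQUIRED = {
    "action": {"action_type", "transaction_id", "reason", "confidence"},
    "observation": {"risk_score", "flags", "kyc_status", "velocity_1h"},
}

def missing_schema_fields(schema: dict) -> dict:
    """Return {section: sorted missing property names} for incomplete sections."""
    missing = {}
    for section, fields in REQUIRED.items():
        present = set(schema.get(section, {}).get("properties", {}))
        absent = sorted(fields - present)
        if absent:
            missing[section] = absent
    return missing

# Hand-written payload for illustration (not real server output).
sample = {
    "action": {"properties": {"action_type": {}, "transaction_id": {}, "reason": {}}},
    "observation": {"properties": {
        "risk_score": {}, "flags": {}, "kyc_status": {}, "velocity_1h": {},
    }},
}
print(missing_schema_fields(sample))  # {'action': ['confidence']}
```

An empty result means every listed property is present. To run it against a live server, load the `/schema` response with `json.load` and pass the resulting dict in.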

---

## T-03  Tasks Endpoint

**Goal:** Confirm all 20 tasks are returned with the correct difficulty distribution.

```bash
curl -s http://localhost:8000/tasks | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('Total tasks:', d['count'])
from collections import Counter
c = Counter(t['difficulty'] for t in d['tasks'])
print('By difficulty:', dict(c))
print()
for t in d['tasks']:
    print(f\"  {t['task_id']:12} [{t['difficulty']:8}] correct={t['correct_action']}\")
"
```

**Expected output**
```
Total tasks: 20
By difficulty: {'easy': 4, 'medium': 6, 'hard': 6, 'critical': 4}

  EASY-001     [easy    ] correct=approve
  EASY-002     [easy    ] correct=reject
  EASY-003     [easy    ] correct=approve
  EASY-004     [easy    ] correct=flag
  MED-001      [medium  ] correct=escalate
  MED-002      [medium  ] correct=hold
  MED-003      [medium  ] correct=flag
  MED-004      [medium  ] correct=flag
  MED-005      [medium  ] correct=hold
  MED-006      [medium  ] correct=escalate
  HARD-001     [hard    ] correct=escalate
  HARD-002     [hard    ] correct=reject
  HARD-003     [hard    ] correct=reject
  HARD-004     [hard    ] correct=approve
  HARD-005     [hard    ] correct=escalate
  HARD-006     [hard    ] correct=flag
  CRIT-001     [critical] correct=approve
  CRIT-002     [critical] correct=reject
  CRIT-003     [critical] correct=escalate
  CRIT-004     [critical] correct=reject
```

> Note: correct_action values for jitter-variant tasks (EASY-004, MED-001/003/004/006,
> HARD-001/006, CRIT-001/003/004) may differ per episode seed — the above shows default values.

**Failure indicator:** count != 20, missing difficulty tier, wrong correct_action.
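
For automated runs, the count and distribution checks can be expressed as a function over the parsed `/tasks` payload. This is a sketch under the assumption that the payload has the `count` and `tasks` keys used above; `task_list_problems` is a hypothetical name.

```python
from collections import Counter

# Distribution taken from the expected output above.
EXPECTED_DISTRIBUTION = {"easy": 4, "medium": 6, "hard": 6, "critical": 4}

def task_list_problems(payload: dict) -> list:
    """Return human-readable problems found in a parsed /tasks payload."""
    problems = []
    tasks = payload.get("tasks", [])
    if payload.get("count") != len(tasks):
        problems.append(f"count={payload.get('count')} but {len(tasks)} tasks listed")
    actual = dict(Counter(t["difficulty"] for t in tasks))
    if actual != EXPECTED_DISTRIBUTION:
        problems.append(f"difficulty distribution {actual} != {EXPECTED_DISTRIBUTION}")
    return problems

# Synthetic payload with the expected shape (not real server output).
demo = {"count": 20, "tasks": [
    {"task_id": f"{tier}-{i}", "difficulty": tier}
    for tier, n in EXPECTED_DISTRIBUTION.items() for i in range(n)
]}
print(task_list_problems(demo))  # []
```

The sketch deliberately skips `correct_action`, since jitter-variant tasks may change it per seed.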

---

## T-04  Reset

**Goal:** Reset the environment and confirm the first task is EASY-001.

```bash
curl -s -X POST http://localhost:8000/reset | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('task_id          :', d['task_id'])
print('transaction_id   :', d['transaction_id'])
print('difficulty        :', d['task_difficulty'])
print('status            :', d['status'])
print('done              :', d['done'])
print('reward            :', d['reward'])
print('cumulative_reward :', d['cumulative_reward'])
print('risk_score        :', d['risk_score'])
"
```

**Expected output**
```
task_id          : EASY-001
transaction_id   : TXN-E001
difficulty        : easy
status            : pending
done              : False
reward            : 0.0
cumulative_reward : 0.0
risk_score        : 0.05
```

**Failure indicator:** `done=true`, `reward != 0`, wrong `task_id`.

---

## T-05  Correct Action — Full Credit (+1.0)

**Goal:** Submit the correct action for EASY-001 (`approve`) and receive reward +1.0.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null   # fresh start

curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E001"}' \
| python3 -c "
import sys, json
d = json.load(sys.stdin)
print('reward           :', d['reward'])
print('correct info     :', d['info'].get('correct_action'))
print('action taken     :', d['info'].get('action_taken'))
"
```

**Expected output**
```
reward           : 1.0
correct info     : approve
action taken     : approve
```

---

## T-06  Wrong Action — Penalty (approve on fraud = -1.0)

**Goal:** Skip to EASY-002 (textbook fraud) and approve it. Expect -1.0 penalty.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null
# Step past EASY-001
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E001"}' > /dev/null

# Now on EASY-002 (correct=reject). Try approving it.
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E002"}' \
| python3 -c "
import sys, json
d = json.load(sys.stdin)
print('reward       :', d['reward'])
print('correct was  :', d['info'].get('correct_action'))
"
```

**Expected output**
```
reward       : -1.0
correct was  : reject
```

---

## T-07  Partial Credit Action

**Goal:** On MED-001 (correct=`escalate`), submit `flag` — should earn +0.5 partial credit.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null
# Step through EASY tasks 1-4 with any actions
for ACTION in approve reject approve reject; do
  curl -s -X POST http://localhost:8000/step \
    -H "Content-Type: application/json" \
    -d "{\"action_type\":\"$ACTION\",\"transaction_id\":\"dummy\"}" > /dev/null
done

# Now on MED-001 (correct=escalate). Submit flag.
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"flag","transaction_id":"TXN-M001"}' \
| python3 -c "
import sys, json
d = json.load(sys.stdin)
print('reward       :', d['reward'])
print('correct was  :', d['info'].get('correct_action'))
print('partial?     :', 0 < d['reward'] < 1.0)
"
```

**Expected output**
```
reward       : 0.5
correct was  : escalate
partial?     : True
```

---

## T-08  Inspect Action — Information Reveal

**Goal:** Use `inspect` on EASY-001 to receive investigation notes and a small reward (+0.15). The episode should NOT advance (still on same transaction).

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"inspect","transaction_id":"TXN-E001"}' \
| python3 -c "
import sys, json
d = json.load(sys.stdin)
print('reward            :', d['reward'])
print('status            :', d['status'])
print('task_id           :', d['task_id'])    # should still be EASY-001
print('inspection_notes  :', d['inspection_notes'])
"
```

**Expected output**
```
reward            : 0.15
status            : inspected
task_id           : EASY-001
inspection_notes  : Sender account opened 3 years ago. Consistent transaction history. KYC fully verified.
```

---

## T-09  Double Inspect — No Double-Dipping

**Goal:** Inspect the same transaction twice. Second inspect should return reward 0.0 (already inspected).

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

# First inspect — reward 0.15
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"inspect","transaction_id":"TXN-E001"}' \
| python3 -c "import sys,json; d=json.load(sys.stdin); print('First inspect reward:', d['reward'])"

# Second inspect — reward 0.0
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"inspect","transaction_id":"TXN-E001"}' \
| python3 -c "import sys,json; d=json.load(sys.stdin); print('Second inspect reward:', d['reward'])"
```

**Expected output**
```
First inspect reward: 0.15
Second inspect reward: 0.0
```
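
The behaviour above implies that the server remembers which transactions have already been inspected in the current episode. One way this could work is sketched below. The class and its names are hypothetical and are not taken from the PayOps source; only the 0.15 and 0.0 values come from the expected output.

```python
INSPECT_REWARD = 0.15  # first-inspect reward from the expected output above

class InspectTracker:
    """Pays the inspect reward at most once per transaction per episode."""

    def __init__(self) -> None:
        self._inspected = set()

    def reset(self) -> None:
        """Forget all inspections, as starting a new episode would."""
        self._inspected.clear()

    def inspect(self, transaction_id: str) -> float:
        """Return the reward for inspecting transaction_id."""
        if transaction_id in self._inspected:
            return 0.0  # repeat inspect: no double-dipping
        self._inspected.add(transaction_id)
        return INSPECT_REWARD

tracker = InspectTracker()
print(tracker.inspect("TXN-E001"))  # 0.15
print(tracker.inspect("TXN-E001"))  # 0.0
```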

---

## T-10  Invalid Action Type

**Goal:** Send an unsupported action type and receive a 422 validation error.

```bash
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"delete","transaction_id":"TXN-E001"}' \
| python3 -c "import sys,json; d=json.load(sys.stdin); print('detail:', d.get('detail','')[:60])"
```

**Expected output** (error detail truncated to 60 characters)
```
detail: Invalid action_type 'delete'. Valid values: ['approve', 'esc
```

The HTTP status code should be `422`. To see it, add `-o /dev/null -w "%{http_code}"` to the `curl` call, as in T-11.
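
A client can catch this class of error before sending the request. The sketch below assumes the six action names exercised in T-22 are the complete valid set, and it imitates the wording of the server message for readability only; `validate_action` is a hypothetical helper.

```python
# Action names exercised in T-22; assumed here to be the complete valid set.
VALID_ACTIONS = {"approve", "reject", "flag", "escalate", "inspect", "hold"}

def validate_action(payload: dict) -> None:
    """Raise ValueError if the action_type is not one of VALID_ACTIONS."""
    action = payload.get("action_type")
    if action not in VALID_ACTIONS:
        raise ValueError(
            f"Invalid action_type {action!r}. Valid values: {sorted(VALID_ACTIONS)}"
        )

try:
    validate_action({"action_type": "delete", "transaction_id": "TXN-E001"})
except ValueError as exc:
    print(exc)
```

A valid payload returns `None` silently, so the call can sit directly in front of the HTTP request.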

---

## T-11  Step Without Reset

**Goal:** Call `/step` without calling `/reset` first. Should return a `400` error.

```bash
# Kill and restart server to guarantee clean state
# Then immediately step without reset:
curl -s -o /dev/null -w "%{http_code}" -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E001"}'
```

**Expected output**
```
400
```

---

## T-12  State Endpoint Tracking

**Goal:** Confirm `/state` reflects the episode progress correctly.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

curl -s http://localhost:8000/state | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('step_count            :', d['step_count'])
print('transactions_processed:', d['transactions_processed'])
print('total_tasks           :', d['total_tasks'])
print('done                  :', d['done'])
"

# Take one step
curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type":"approve","transaction_id":"TXN-E001"}' > /dev/null

curl -s http://localhost:8000/state | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('step_count            :', d['step_count'])
print('transactions_processed:', d['transactions_processed'])
print('last_action           :', d['last_action'])
print('cumulative_reward     :', d['cumulative_reward'])
"
```

**Expected output (before step)**
```
step_count            : 0
transactions_processed: 0
total_tasks           : 12
done                  : False
```

**Expected output (after step)**
```
step_count            : 1
transactions_processed: 1
last_action           : approve
cumulative_reward     : 1.0
```

---

## T-13  Complete Episode — Done Flag

**Goal:** Step through all 12 tasks and confirm `done=true` on the last step.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
ACTIONS = [
    "approve","reject","approve","flag",   # easy
    "escalate","hold","flag","flag",       # medium
    "escalate","reject","reject","approve" # hard (perfect sequence)
]

client = httpx.Client()
txn_ids = [t["transaction_id"] for t in client.get(f"{BASE}/tasks").json()["tasks"]]

for i, (action, txn) in enumerate(zip(ACTIONS, txn_ids)):
    resp = client.post(f"{BASE}/step", json={"action_type": action, "transaction_id": txn}).json()
    print(f"Step {i+1:2d}  {txn:12}  action={action:10}  reward={resp['reward']:+.2f}  done={resp['done']}")

client.close()
EOF
```

**Expected output (last line)**
```
Step 12  TXN-H004      action=approve     reward=+1.00  done=True
```

All other steps should show `done=False`.

---

## T-14  Grader Endpoint

**Goal:** Grade and score the episode immediately after completing all steps.

```bash
# Run the perfect sequence first (T-13 above), then:
curl -s http://localhost:8000/grader | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('total_reward      :', d['total_reward'])
print('max_possible      :', d['max_possible_reward'])
print('normalised_score  :', d['normalised_score'])
print('passed            :', d['passed'])
print()
for t in d['per_task']:
    mark = '✓' if t['correct'] else '✗'
    print(f\"  {mark} {t['task_id']:12} action={t['action_taken']:10} correct={t['correct_action']:10} reward={t['reward']:+.2f}\")
"
```

**Expected output (perfect run)**
```
total_reward      : 12.0
max_possible      : 12.0
normalised_score  : 1.0
passed            : True

  ✓ EASY-001     action=approve     correct=approve     reward=+1.00
  ✓ EASY-002     action=reject      correct=reject      reward=+1.00
  ...
  ✓ HARD-004     action=approve     correct=approve     reward=+1.00
```

---

## T-15  Grader Without Episode

**Goal:** Call `/grader` before any steps — should return a 400 error.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null
curl -s http://localhost:8000/grader
```

**Expected output**
```json
{"error": "No actions recorded. Run /reset then /step first."}
```

---

## T-16  Baseline Endpoint

**Goal:** Confirm `/baseline` runs the rule-based agent and returns a normalised score ≥ 0.5.

```bash
curl -s -X POST http://localhost:8000/baseline | python3 -c "
import sys, json
d = json.load(sys.stdin)
print('normalised_score :', d['normalised_score'])
print('total_reward     :', d['total_reward'])
print('steps            :', d['steps'])
print('passed (>=0.5)   :', d['normalised_score'] >= 0.5)
print()
for t in d['scores']:
    mark = '✓' if t['correct'] else '✗'
    print(f\"  {mark} {t['task_id']:12} [{t['difficulty']:6}] action={t['action_taken']:10} reward={t['reward']:+.2f}\")
"
```

**Expected output**
```
normalised_score : 0.7292
total_reward     : 8.75
steps            : 12
passed (>=0.5)   : True

  ✓ EASY-001     [easy  ] action=approve     reward=+1.00
  ✓ EASY-002     [easy  ] action=reject      reward=+1.00
  ...
```
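
The reported figures are consistent with a simple ratio: 8.75 / 12 ≈ 0.7292. The sketch below reproduces that arithmetic. Clamping the result to [0, 1] and the function names are assumptions, not confirmed server behaviour; the 0.5 pass threshold comes from the goal stated above.

```python
PASS_THRESHOLD = 0.5  # pass mark from the goal above

def normalised_score(total_reward: float, max_possible: float) -> float:
    """Ratio of earned to maximum reward, clamped to [0, 1] (clamp is assumed)."""
    if max_possible <= 0:
        raise ValueError("max_possible must be positive")
    ratio = total_reward / max_possible
    return round(min(1.0, max(0.0, ratio)), 4)

def passed(score: float) -> bool:
    """True when the score reaches the pass threshold."""
    return score >= PASS_THRESHOLD

score = normalised_score(8.75, 12.0)
print(score, passed(score))  # 0.7292 True
```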

---

## T-17  Optional Reason and Confidence Fields

**Goal:** Submit an action with optional `reason` and `confidence`. Should be accepted without error.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null

curl -s -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{
    "action_type": "flag",
    "transaction_id": "TXN-E001",
    "reason": "Low risk but flagging for audit trail verification",
    "confidence": 0.82
  }' \
| python3 -c "import sys,json; d=json.load(sys.stdin); print('HTTP OK, reward:', d['reward'])"
```

**Expected output**
```
HTTP OK, reward: 0.3
```

The reward of 0.3 is partial credit for flagging a transaction whose correct action is `approve`.

---

## T-18  Hard Task — Model Poisoning (HARD-001)

**Goal:** Demonstrate that the low risk_score (0.18) is a trap. The correct action is `escalate`.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null
# Advance to HARD-001 (task 9) — step through easy + medium with dummy actions
python3 - <<'EOF'
import httpx
BASE = "http://localhost:8000"
c = httpx.Client()
tasks = c.get(f"{BASE}/tasks").json()["tasks"]

# Steps 1-8: easy + medium
for t in tasks[:8]:
    c.post(f"{BASE}/step", json={"action_type": "approve", "transaction_id": t["transaction_id"]})

# Step 9: HARD-001 — agent trusts the low risk_score and approves (wrong!)
resp_wrong = c.post(f"{BASE}/step", json={
    "action_type": "approve", "transaction_id": "TXN-H001"
}).json()
print("Trusted ML score → approve")
print("  reward   :", resp_wrong["reward"])   # expect -0.5
print("  correct  :", resp_wrong["info"]["correct_action"])

# Reset and do it correctly
c.post(f"{BASE}/reset")
for t in tasks[:8]:
    c.post(f"{BASE}/step", json={"action_type": "approve", "transaction_id": t["transaction_id"]})

resp_correct = c.post(f"{BASE}/step", json={
    "action_type": "escalate", "transaction_id": "TXN-H001"
}).json()
print("\nOverrode ML score → escalate")
print("  reward   :", resp_correct["reward"])  # expect +1.0
c.close()
EOF
```

**Expected output**
```
Trusted ML score → approve
  reward   : -0.5
  correct  : escalate

Overrode ML score → escalate
  reward   : 1.0
```

---

## T-19  Inspect Reveals Hidden Context (HARD-001)

**Goal:** Inspect HARD-001 to reveal the mule-account intelligence note before deciding.

```bash
curl -s -X POST http://localhost:8000/reset > /dev/null
# Advance to HARD-001
python3 - <<'EOF'
import httpx
BASE = "http://localhost:8000"
c = httpx.Client()
tasks = c.get(f"{BASE}/tasks").json()["tasks"]
for t in tasks[:8]:
    c.post(f"{BASE}/step", json={"action_type": "approve", "transaction_id": t["transaction_id"]})

# Inspect HARD-001
resp = c.post(f"{BASE}/step", json={
    "action_type": "inspect", "transaction_id": "TXN-H001"
}).json()
print("Inspect reward :", resp["reward"])
print("Notes          :", resp["inspection_notes"])
c.close()
EOF
```

**Expected output**
```
Inspect reward : 0.15
Notes          : Account created 7 days ago. This is the first outbound transfer. Receiver matches a pattern of solicitor-impersonation mule accounts flagged in last month's intelligence bulletin. Risk model underscored due to clean transaction history (new account).
```

---

## T-20  WebSocket Session

**Goal:** Run a full reset → step sequence over the WebSocket endpoint.

```bash
pip install websockets -q   # if not already installed

python3 - <<'EOF'
import asyncio, json, websockets

async def test_ws():
    uri = "ws://localhost:8000/ws"
    async with websockets.connect(uri) as ws:
        # Reset
        await ws.send(json.dumps({"type": "reset"}))
        obs = json.loads(await ws.recv())
        print("Reset  →", obs["transaction_id"], "risk:", obs["risk_score"])

        # Step – approve
        await ws.send(json.dumps({
            "type": "step",
            "action_type": "approve",
            "transaction_id": obs["transaction_id"]
        }))
        obs2 = json.loads(await ws.recv())
        print("Step   →", "reward:", obs2["reward"], "next:", obs2["transaction_id"])

        # State
        await ws.send(json.dumps({"type": "state"}))
        state = json.loads(await ws.recv())
        print("State  →", "steps:", state["step_count"], "txns:", state["transactions_processed"])

asyncio.run(test_ws())
EOF
```

**Expected output**
```
Reset  → TXN-E001 risk: 0.05
Step   → reward: 1.0 next: TXN-E002
State  → steps: 1 txns: 1
```
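
The three message shapes used in this test can be built by one small helper, which keeps longer WebSocket scripts readable. This is a sketch: `ws_message` is a hypothetical name, and it only knows the `reset`, `step`, and `state` types shown above. Whether the server accepts other types is not covered here.

```python
import json

KNOWN_TYPES = {"reset", "step", "state"}  # message types used in the test above

def ws_message(kind: str, **fields) -> str:
    """Serialise one WebSocket message in the shape used by the test above."""
    if kind not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {kind!r}")
    if kind == "step":
        # The step message in the test carries both of these fields.
        for required in ("action_type", "transaction_id"):
            if required not in fields:
                raise ValueError(f"step message needs {required}")
    return json.dumps({"type": kind, **fields})

print(ws_message("reset"))  # {"type": "reset"}
print(ws_message("step", action_type="approve", transaction_id="TXN-E001"))
```

Each returned string can be passed straight to `ws.send(...)`.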

---

## T-21  Baseline Agent Script (Standalone)

**Goal:** Run the standalone Python baseline script independently of the server.

```bash
cd /Users/padmapriya
PYTHONPATH=/Users/padmapriya python3 payops_env/scripts/baseline_agent.py
```

**Expected output (last few lines)**
```
============================================================
  Episode Summary
============================================================
  Steps             : 12
  Total reward      : +8.75
  Max possible      : 12.00
  Normalised score  : 0.7292
  Passed (≥0.5)     : YES ✓
============================================================
```

---

## T-22  All Actions Are Valid on Each Task

**Goal:** Confirm every action type is accepted without error (even if penalised).

```bash
python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
ACTIONS = ["approve", "reject", "flag", "escalate", "inspect", "hold"]

c = httpx.Client()
for action in ACTIONS:
    c.post(f"{BASE}/reset")
    resp = c.post(f"{BASE}/step", json={
        "action_type": action,
        "transaction_id": "TXN-E001"
    })
    print(f"action={action:10}  HTTP={resp.status_code}  reward={resp.json()['reward']:+.2f}")
c.close()
EOF
```

**Expected output**
```
action=approve     HTTP=200  reward=+1.00
action=reject      HTTP=200  reward=-0.50
action=flag        HTTP=200  reward=+0.30
action=escalate    HTTP=200  reward=-0.25
action=inspect     HTTP=200  reward=+0.15
action=hold        HTTP=200  reward=-0.25
```

---

## T-23  Full Perfect Episode (Score = 1.0)

**Goal:** Submit all 12 correct actions and confirm normalised_score = 1.0.

```bash
python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
c = httpx.Client()

tasks = c.get(f"{BASE}/tasks").json()["tasks"]
c.post(f"{BASE}/reset")

for t in tasks:
    resp = c.post(f"{BASE}/step", json={
        "action_type": t["correct_action"],
        "transaction_id": t["transaction_id"]
    }).json()
    mark = "✓" if resp["reward"] == 1.0 else "✗"
    print(f"{mark} {t['task_id']:12} action={t['correct_action']:10} reward={resp['reward']:+.2f}")

score = c.get(f"{BASE}/grader").json()
print()
print("Normalised score:", score["normalised_score"])
print("Passed          :", score["passed"])
c.close()
EOF
```

**Expected output**
```
✓ EASY-001     action=approve     reward=+1.00
✓ EASY-002     action=reject      reward=+1.00
✓ EASY-003     action=approve     reward=+1.00
✓ EASY-004     action=flag        reward=+1.00
✓ MED-001      action=escalate    reward=+1.00
✓ MED-002      action=hold        reward=+1.00
✓ MED-003      action=flag        reward=+1.00
✓ MED-004      action=flag        reward=+1.00
✓ HARD-001     action=escalate    reward=+1.00
✓ HARD-002     action=reject      reward=+1.00
✓ HARD-003     action=reject      reward=+1.00
✓ HARD-004     action=approve     reward=+1.00

Normalised score: 1.0
Passed          : True
```

---

## T-24  Worst-Case Episode (Approve Everything)

**Goal:** Approve all 12 transactions (maximally wrong) and confirm very low score.

```bash
python3 - <<'EOF'
import httpx

BASE = "http://localhost:8000"
c = httpx.Client()

tasks = c.get(f"{BASE}/tasks").json()["tasks"]
c.post(f"{BASE}/reset")

total = 0
for t in tasks:
    resp = c.post(f"{BASE}/step", json={
        "action_type": "approve",
        "transaction_id": t["transaction_id"]
    }).json()
    total += resp["reward"]
    print(f"{t['task_id']:12} correct={t['correct_action']:10} reward={resp['reward']:+.2f}")

score = c.get(f"{BASE}/grader").json()
print()
print(f"Total reward     : {total:+.2f}")
print(f"Normalised score : {score['normalised_score']}")
print(f"Passed           : {score['passed']}")
c.close()
EOF
```

**Expected outcome:** Several `-1.0` and `-0.5` penalties. Normalised score near or equal to `0.0`. `passed=False`.

---

## Quick Reference — Expected Rewards Per Action

| Scenario | Action | Reward |
|----------|--------|--------|
| Correct decision | any | `+1.0` |
| Inspect (first time) | `inspect` | `+0.15` |
| Inspect (already inspected) | `inspect` | `0.0` |
| Partial credit (task-specific) | adjacent | `+0.2` – `+0.6` |
| Approve fraud/escalation | `approve` | `-1.0` |
| Approve flagged/held | `approve` | `-0.5` |
| Reject legitimate tx | `reject` | `-0.5` |
| Any other wrong action | any | `-0.25` |
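
Read top to bottom, the table amounts to a lookup. The sketch below encodes it as a function for use in test assertions. It reflects only the defaults in the table: partial credit is task-specific and must be supplied by the caller, and individual tasks can deviate (T-18, for example, expects -0.5 for approving HARD-001). `default_reward` is a hypothetical name, not the server's scoring code.

```python
from typing import Optional

def default_reward(correct: str, taken: str, already_inspected: bool = False,
                   partial_credit: Optional[float] = None) -> float:
    """Default reward for one step, following the quick-reference table."""
    if taken == "inspect":
        return 0.0 if already_inspected else 0.15
    if taken == correct:
        return 1.0
    if partial_credit is not None:
        return partial_credit  # task-specific value, e.g. 0.5 in T-07
    if taken == "approve" and correct in ("reject", "escalate"):
        return -1.0            # approved fraud or an escalation case
    if taken == "approve" and correct in ("flag", "hold"):
        return -0.5            # approved a flagged or held transaction
    if taken == "reject" and correct == "approve":
        return -0.5            # rejected a legitimate transaction
    return -0.25               # any other wrong action

print(default_reward("reject", "approve"))                     # -1.0
print(default_reward("approve", "escalate"))                   # -0.25
print(default_reward("approve", "flag", partial_credit=0.3))   # 0.3
```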

---

## Quick Reference — Correct Actions per Task

| Task ID | Difficulty | Correct Action | Key Signal |
|---------|-----------|----------------|------------|
| EASY-001 | easy | `approve` | risk=0.05, no flags, verified KYC |
| EASY-002 | easy | `reject` | sanctioned country, unknown sender, risk=0.97 |
| EASY-003 | easy | `approve` | small refund, risk=0.03, verified |
| EASY-004 | easy | `flag` | velocity_1h=15 (ATM burst) |
| MED-001 | medium | `escalate` | large B2B, cross-border, medium-risk country |
| MED-002 | medium | `hold` | KYC pending, large internal transfer |
| MED-003 | medium | `flag` | amount 3× historical average |
| MED-004 | medium | `flag` | crypto exchange, moderate risk |
| HARD-001 | hard | `escalate` | risk_score=0.18 is poisoned — manual flags say escalate |
| HARD-002 | hard | `reject` | APP scam, mule account pattern |
| HARD-003 | hard | `reject` | structuring/smurfing, KYC failed |
| HARD-004 | hard | `approve` | legitimate FX settlement — looks scary, is fine |

---

## Running All Tests in One Script

Save the following as `payops_env/run_tests.sh`, then run it from `/Users/padmapriya` while the server is up:

```bash
#!/usr/bin/env bash
# run_tests.sh — smoke-test all PayOps endpoints

set -e
BASE="http://localhost:8000"
PASS=0
FAIL=0

check() {
  local name="$1"
  local got="$2"
  local want="$3"
  if echo "$got" | grep -q "$want"; then
    echo "  ✓  $name"
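    # Plain assignment: ((PASS++)) returns status 1 when PASS is 0, which aborts under set -e
    PASS=$((PASS + 1))
  else
    echo "  ✗  $name  (expected '$want', got '$got')"
    FAIL=$((FAIL + 1))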
  fi
}

echo "=== PayOps Test Suite ==="

check "T-01 health"        "$(curl -s $BASE/health)"              '"status":"ok"'
check "T-02 schema"        "$(curl -s $BASE/schema)"              '"PayOpsAction"'
check "T-03 tasks count"   "$(curl -s $BASE/tasks)"               '"count":12'
check "T-04 reset"         "$(curl -s -X POST $BASE/reset)"       '"task_id":"EASY-001"'
check "T-05 correct step"  "$(curl -s -X POST $BASE/step -H 'Content-Type: application/json' -d '{"action_type":"approve","transaction_id":"TXN-E001"}')"  '"reward":1.0'
check "T-10 invalid action" "$(curl -s -X POST $BASE/step -H 'Content-Type: application/json' -d '{"action_type":"delete","transaction_id":"TXN-E001"}')" "Invalid action_type"
check "T-16 baseline"      "$(curl -s -X POST $BASE/baseline)"    '"normalised_score"'

echo ""
echo "Results: $PASS passed, $FAIL failed"
```

```bash
cd /Users/padmapriya
bash payops_env/run_tests.sh
```

**Expected output**
```
=== PayOps Test Suite ===
  ✓  T-01 health
  ✓  T-02 schema
  ✓  T-03 tasks count
  ✓  T-04 reset
  ✓  T-05 correct step
  ✓  T-10 invalid action
  ✓  T-16 baseline

Results: 7 passed, 0 failed
```
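
The `grep` patterns in the script match raw text, so they depend on the server emitting compact JSON with no space after the colon. A parsed comparison avoids that dependency. The helper below is a sketch of that idea; `json_field_is` is a hypothetical name.

```python
import json

def json_field_is(body: str, key: str, expected) -> bool:
    """True if body is a JSON object whose top-level key equals expected."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False  # not JSON at all, e.g. an empty body or an error page
    return isinstance(data, dict) and data.get(key) == expected

print(json_field_is('{"status": "ok", "version": "2.0.0"}', "status", "ok"))  # True
print(json_field_is('{"status":"ok"}', "status", "ok"))                        # True
print(json_field_is('connection refused', "status", "ok"))                     # False
```

Both spaced and compact JSON pass, and a non-JSON body fails cleanly instead of raising.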

---

## Interactive API Explorer

FastAPI serves auto-generated interactive docs. Open in a browser while the server is running:

```
http://localhost:8000/docs      ← Swagger UI (try endpoints in-browser)
http://localhost:8000/redoc     ← ReDoc documentation
```