scripts/
Utility scripts bundled with the Headroom repo. Most are one-off operator tools; a few are runnable as part of development workflows.
Reproducing the reconnect storm
repro_codex_replay.py reproduces the multi-agent Codex reconnect/retry storm
against a local Headroom proxy (default http://127.0.0.1:8787), as described
in wiki/plans/2026-04-17-codex-proxy-runtime-analysis.md under "Latest
Correction". Use it to:
- Regression-check that /livez stays responsive under a cold-start storm.
- Empirically tune the Unit 4 pre-upstream semaphore default (HEADROOM_ANTHROPIC_PRE_UPSTREAM_CONCURRENCY).
- Exercise the Codex WS lifecycle + Anthropic HTTP path simultaneously without needing to replay captured production traffic.
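To illustrate what tuning the semaphore means, the following is a minimal sketch of how a bounded asyncio semaphore caps concurrent pre-upstream work. It is not Headroom's implementation: read_limit, handle, storm, the fallback value, and the 10 ms sleep are all hypothetical stand-ins, and only the environment variable name comes from this README.

```python
import asyncio

ENV_VAR = "HEADROOM_ANTHROPIC_PRE_UPSTREAM_CONCURRENCY"


def read_limit(env, fallback):
    """Read the concurrency limit from a mapping such as os.environ.

    The real default is not stated in this README, so the caller must
    supply a fallback explicitly (assumption).
    """
    return int(env.get(ENV_VAR, fallback))


async def handle(sem, state):
    """Simulate one request; the sleep stands in for pre-upstream work."""
    async with sem:
        state["now"] += 1
        state["max"] = max(state["max"], state["now"])
        await asyncio.sleep(0.01)
        state["now"] -= 1


async def storm(limit, requests):
    """Fire `requests` simulated requests at once; return peak concurrency."""
    sem = asyncio.Semaphore(limit)
    state = {"now": 0, "max": 0}
    await asyncio.gather(*(handle(sem, state) for _ in range(requests)))
    return state["max"]
```

With a small limit, peak concurrency is capped at that limit. With an effectively unbounded limit, every simulated request runs at once, which is the condition the storm script is meant to stress.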
Run
# Default: 8 WS + 4 HTTP clients, 30s storm, p99 /livez must stay <= 500ms.
python scripts/repro_codex_replay.py
# Tighter budget, heavier and longer run:
python scripts/repro_codex_replay.py \
--url http://127.0.0.1:8787 \
--ws-clients 16 \
--anthropic-clients 8 \
--duration 60 \
--livez-threshold-ms 100
# Dump the full summary as JSON for downstream tooling:
python scripts/repro_codex_replay.py --json
Exit code:
- 0: warmup succeeded (or was skipped), storm ran for the requested duration, and /livez p99 stayed under --livez-threshold-ms.
- 1: soft assertion failed, proxy unreachable, or unhandled exception. Proxy-unreachable is detected and reported within ~5 seconds.
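If you drive the script from another tool, the exit codes above can be mapped to a pass/fail message. This is a sketch only: describe_exit and run_repro are hypothetical helper names, and the script path assumes you run from the repository root.

```python
import subprocess
import sys


def describe_exit(returncode):
    """Translate the documented exit codes into a short message."""
    if returncode == 0:
        return "pass: storm completed and /livez p99 stayed under the threshold"
    if returncode == 1:
        return "fail: soft assertion failed, proxy unreachable, or unhandled exception"
    # Only 0 and 1 are documented; anything else is reported as-is.
    return f"unexpected exit code {returncode}"


def run_repro(extra_args=()):
    """Run the repro script with the current interpreter; return its exit code."""
    cmd = [sys.executable, "scripts/repro_codex_replay.py", *extra_args]
    completed = subprocess.run(cmd, check=False)
    return completed.returncode
```

For example, print(describe_exit(run_repro(["--duration", "10"]))) runs a ten-second storm and prints the outcome.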
Fixtures
The script loads two hand-crafted, fully synthetic JSON fixtures:
- scripts/fixtures/anthropic_replay_body.json: shape of a large agent reconnect replay /v1/messages?beta=true POST body.
- scripts/fixtures/codex_response_create_frame.json: first Codex WS frame with the {"type": "response.create", "response": {...}} envelope.
Override via --ws-frame-fixture / --anthropic-body-fixture if you have
captured traffic to replay instead.
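Before pointing --ws-frame-fixture at a captured frame, it can help to check that it has the same envelope as the bundled fixture. The sketch below assumes the script expects that envelope for custom fixtures too, which this README does not state; is_response_create_frame and load_ws_frame are hypothetical helpers, not part of the script.

```python
import json
from pathlib import Path


def is_response_create_frame(frame):
    """True if frame looks like {"type": "response.create", "response": {...}}."""
    return (
        isinstance(frame, dict)
        and frame.get("type") == "response.create"
        and isinstance(frame.get("response"), dict)
    )


def load_ws_frame(path):
    """Load a JSON fixture and reject it if the envelope does not match."""
    frame = json.loads(Path(path).read_text(encoding="utf-8"))
    if not is_response_create_frame(frame):
        raise ValueError(f"{path} is not a response.create envelope")
    return frame
```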
Interpretation
- /livez p99 under threshold means the event loop is not starved during the storm. If it rises with the semaphore unbounded (HEADROOM_ANTHROPIC_PRE_UPSTREAM_CONCURRENCY=10000) and drops back under the default, Unit 4's backpressure is working.
- Codex WS: opened should equal --ws-clients. response.completed typically stays low when upstream auth isn't configured locally; the goal is handshake + relay wiring, not real upstream traffic.
- Anthropic HTTP: ok_2xx + non_2xx + timed_out + errors should roughly equal attempted. Sustained non-zero timed_out during the storm is the failure signal the plan targets.
A smoke test at tests/test_scripts/test_repro_codex_replay_smoke.py
exercises the script against a mock FastAPI server on every PR.
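The Anthropic HTTP accounting rule from the interpretation notes can be checked mechanically. The counter names below come from this README, but the flat dict layout and the slack of 2 are assumptions; the actual --json summary may nest or name these fields differently.

```python
def anthropic_counts_consistent(stats, slack=2):
    """Check ok_2xx + non_2xx + timed_out + errors against attempted.

    slack is a hypothetical tolerance for requests still in flight
    when the storm ends ("roughly equal").
    """
    accounted = (
        stats["ok_2xx"] + stats["non_2xx"] + stats["timed_out"] + stats["errors"]
    )
    return abs(stats["attempted"] - accounted) <= slack


def has_timeouts(stats):
    """Flag any timeouts; a final count cannot show whether they were sustained."""
    return stats["timed_out"] > 0
```

A run where the counters do not add up, or where has_timeouts is true, is worth inspecting in the full summary rather than trusting the exit code alone.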
Install scripts
- install.sh: POSIX installer.
- install.ps1: Windows PowerShell installer.
These are generated by the release pipeline; edit with care.