# Testing Strategy

The test suite is designed to protect the repo from regressions that would weaken its value as a reproducible RAG evaluation artifact.

## Covered areas

- CSV bundle loading and schema checks.
- Primary key presence and uniqueness for core tables.
- Required foreign-key presence and referential integrity across examples, retrieval events, chunks, documents, and scenarios.
- Strict numeric validation and standardization for required and optional numeric fields, including rejection of non-numeric corruption and of missing required numeric values.
- Metric and policy output contracts.
- Numeric regression checks for risk scoring, retrieval outcome classification, normalization of the evidence-strength proxy and the review weight, and policy monotonicity.
- Config leaderboard and risk-slice behavior.
- Project hygiene checks for docs, Docker, CI, Streamlit smoke coverage, Docker health-smoke coverage, Trace Explorer literal search behavior, and view separation.

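As a rough illustration of the data-integrity checks listed above, the following minimal sketch uses only the standard library. The table names, column names, sample rows, and helper functions (`duplicate_keys`, `orphan_keys`, `parse_required_float`) are hypothetical and are not taken from the repo; the actual tests may be structured quite differently.

```python
import csv
import io
import math


def load_rows(csv_text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def duplicate_keys(rows, key):
    """Return primary-key values that are blank or appear more than once."""
    seen, problems = set(), []
    for row in rows:
        value = (row.get(key) or "").strip()
        if not value or value in seen:
            problems.append(value)
        seen.add(value)
    return problems


def orphan_keys(child_rows, fk, parent_rows, pk):
    """Return foreign-key values in child_rows with no matching parent row."""
    parent_ids = {row[pk] for row in parent_rows}
    return [row[fk] for row in child_rows if row[fk] not in parent_ids]


def parse_required_float(raw):
    """Strictly parse a required numeric field.

    Rejects blanks, non-numeric text such as "0.9x", and non-finite values.
    """
    text = (raw or "").strip()
    if not text:
        raise ValueError("missing required numeric value")
    value = float(text)  # raises ValueError on non-numeric corruption
    if not math.isfinite(value):
        raise ValueError(f"non-finite numeric value: {text!r}")
    return value


# Hypothetical sample data, for illustration only.
DOCUMENTS = "document_id,title\nd1,Guide\nd2,FAQ\n"
CHUNKS = "chunk_id,document_id,score\nc1,d1,0.91\nc2,d2,0.40\nc3,d9,0.9x\n"

documents = load_rows(DOCUMENTS)
chunks = load_rows(CHUNKS)
```

With this sample data, `duplicate_keys(chunks, "chunk_id")` finds no problems, `orphan_keys(chunks, "document_id", documents, "document_id")` reports the dangling reference `d9`, and `parse_required_float("0.9x")` raises `ValueError`.
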
## Local checks

```bash
make check
```

This runs lint, byte-compilation, the test suite, and an app import smoke check:

```bash
ruff check app.py src tests scripts
python -m compileall app.py src tests scripts
python scripts/run_pytest.py -q
python -c "import app; from src.dashboard import CommandCenterApp; CommandCenterApp()"
```

## CI checks

GitHub Actions runs lint, compile, tests, and a Streamlit import smoke check on Python 3.11 and 3.12. After the matrix passes, a separate job uses `scripts/docker_smoke.py` to build the Docker image, start the container, verify that the container remains running, and probe Streamlit's `/_stcore/health` endpoint.

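The health probe step can be pictured as a small polling loop. The sketch below is an assumption about how such a probe might look, not the contents of `scripts/docker_smoke.py`; the function name `wait_for_health` and its retry parameters are hypothetical.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url, attempts=10, delay=1.0, timeout=2.0):
    """Poll url until it returns HTTP 200, or give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not ready yet: connection refused, timeout, or HTTP error.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A smoke script could call `wait_for_health("http://localhost:8501/_stcore/health")` after starting the container and fail the job if it returns `False`. Port 8501 is Streamlit's default; the port the repo's container actually publishes is an assumption here.
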
## Runtime-binding regression checks

The test suite includes AST-level checks for class methods defined without `self` and without `@staticmethod`. Such code is valid Python syntax and compiles cleanly, so it can pass CI, yet it fails at call time in Streamlit render paths because Python implicitly binds the instance as the first argument.

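A check of this kind can be sketched with the standard `ast` module. This is a minimal illustration, not the repo's actual test: the function name `methods_missing_self` and the sample class are hypothetical, and exempting `@classmethod` is an added assumption beyond the `@staticmethod` rule described above.

```python
import ast


def methods_missing_self(source):
    """Return 'Class.method' names for methods lacking a `self` parameter.

    A method is flagged when it is not decorated with @staticmethod or
    @classmethod and its first positional parameter is not named `self`.
    """
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for item in node.body:
            if not isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            decorators = {
                d.id for d in item.decorator_list if isinstance(d, ast.Name)
            }
            if decorators & {"staticmethod", "classmethod"}:
                continue
            positional = item.args.posonlyargs + item.args.args
            if not positional or positional[0].arg != "self":
                flagged.append(f"{node.name}.{item.name}")
    return flagged


# Hypothetical sample: `helper` compiles fine but fails when called on an instance.
SAMPLE = '''
class View:
    def render(self):
        return 1

    def helper():
        return 2

    @staticmethod
    def fmt(value):
        return str(value)
'''

print(methods_missing_self(SAMPLE))  # ['View.helper']
```

Calling `View().helper()` on the sample class would raise `TypeError` because the instance is passed as an unexpected positional argument, which is the failure mode this kind of check is meant to catch before render time.
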
In environments where Streamlit is installed, smoke tests also verify selected controller/view helper methods through an instantiated `CommandCenterApp`.