Architecture
RAG QA Command Center is a self-contained Streamlit application for reviewing offline RAG evaluation logs. It is intentionally deterministic: packaged CSV artifacts are loaded locally, validated at startup, transformed into review views, and exported without external services.
Runtime flow
app.py
-> src.dashboard.CommandCenterApp
-> src.data.DataRepository.load()
-> src.data.validate_bundle()
-> src.analytics.*
-> src.charts.*
-> src.views.*
-> Streamlit UI / downloads
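The ordering above can be sketched in plain Python. This is only an illustration of the load, validate, analyze sequence: `Bundle`, `compute_metrics`, the `queries` table, and the `score` column are hypothetical stand-ins, not the project's actual API, and the real loader reads packaged CSV files rather than an inline dictionary.

```python
from dataclasses import dataclass


@dataclass
class Bundle:
    """Hypothetical container: table name -> list of row dicts."""
    tables: dict


def load() -> Bundle:
    # Stand-in for src.data.DataRepository.load(); data is inlined here.
    return Bundle(tables={"queries": [{"query_id": "q1", "score": "0.8"}]})


def validate_bundle(bundle: Bundle) -> None:
    # Stand-in for src.data.validate_bundle(); raises before anything renders.
    if not bundle.tables.get("queries"):
        raise ValueError("queries table is missing or empty")


def compute_metrics(bundle: Bundle) -> dict:
    # Stand-in for src.analytics.*; a pure function of the validated bundle.
    scores = [float(row["score"]) for row in bundle.tables["queries"]]
    return {"mean_score": sum(scores) / len(scores)}


def run() -> dict:
    bundle = load()
    validate_bundle(bundle)  # fail fast: no view is built from invalid data
    return compute_metrics(bundle)  # charts and views would consume this result
```

Because every step is a function of local data only, running the sketch twice yields the same result, which is the property the deterministic design relies on.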
Main layers
| Layer | Files | Responsibility |
|---|---|---|
| Entry point | app.py | Thin launcher only. |
| Controller | src/dashboard.py | Page orchestration, filters, context construction, and shared rendering helpers. |
| View mixins | src/views/ | One focused module per tab/page. |
| Data access | src/data.py | CSV discovery, loading, standardization, schema and integrity validation. |
| Analytics | src/analytics.py | Metrics, risk slices, retrieval outcomes, config scoring, policy curves, trace queues. |
| Charts | src/charts.py | Plotly figure factories. |
| UI primitives | src/ui.py | Reusable Streamlit/HTML components with controlled markup. |
| State models | src/app_state.py, src/models.py | Typed app context and runtime settings. |
| Tests | tests/ | Data contracts, numeric analytics regressions, Streamlit smoke coverage, project hygiene, and release checks. |
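The relationship between the controller and the view mixins can be illustrated with a minimal sketch. Only the class name `CommandCenterApp` comes from this document; the mixin names, the `context` dictionary, and the returned strings are assumptions made for the example, and the real views render Streamlit components rather than returning text.

```python
class OverviewViewMixin:
    """Hypothetical view module: renders one page from the shared context."""

    def render_overview(self) -> str:
        return f"Overview for {self.context['run_name']}"


class TraceViewMixin:
    """Hypothetical view module for a trace review page."""

    def render_traces(self) -> str:
        return f"{len(self.context['traces'])} traces queued"


class CommandCenterApp(OverviewViewMixin, TraceViewMixin):
    """Controller: owns the context and decides which pages to render."""

    def __init__(self, context: dict) -> None:
        self.context = context

    def run(self) -> list:
        # Each page lives in its own mixin; the controller only orchestrates.
        return [self.render_overview(), self.render_traces()]
```

Adding a page under this layout means adding one mixin and one call in the controller, so the controller file stays small as the number of tabs grows.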
Design choices
- Separation of concerns: data loading, analytics, charts, and UI pages live in separate modules.
- Thin entrypoint: app.py only starts the application.
- View composition: the main controller inherits focused view mixins instead of growing a monolithic UI file.
- Fail-fast data contract: packaged tables are validated before the UI renders, including key presence, references, strict numeric conversion, metric ranges, relevance flags, and rank sanity checks.
- Offline-first review: the app reviews fixed evaluation artifacts and does not call live LLMs.
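
A fail-fast contract of the kind described in the list above might look like the following sketch. It uses only the standard library, and the table layouts and column names (`query_id`, `faithfulness`, `rank`, `is_relevant`) are assumptions chosen for illustration; the project's actual schema and its `validate_bundle` signature may differ.

```python
import csv
import io

# Hypothetical miniature tables standing in for the packaged CSV artifacts.
QUERIES_CSV = "query_id,faithfulness\nq1,0.92\nq2,0.40\n"
RETRIEVALS_CSV = "query_id,rank,is_relevant\nq1,1,1\nq1,2,0\nq2,1,0\n"


def read_rows(text):
    """Parse CSV text into a list of row dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def validate_bundle(queries, retrievals):
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    known_ids = set()
    for i, row in enumerate(queries):
        qid = row.get("query_id")
        if not qid:  # key presence
            errors.append(f"queries[{i}]: missing query_id")
        else:
            known_ids.add(qid)
        try:  # strict numeric conversion, no silent coercion
            score = float(row["faithfulness"])
        except (KeyError, TypeError, ValueError):
            errors.append(f"queries[{i}]: faithfulness is not numeric")
        else:
            if not 0.0 <= score <= 1.0:  # metric range (also rejects NaN)
                errors.append(f"queries[{i}]: faithfulness outside [0, 1]")
    for i, row in enumerate(retrievals):
        if row.get("query_id") not in known_ids:  # reference check
            errors.append(f"retrievals[{i}]: unknown query_id")
        if row.get("is_relevant") not in {"0", "1"}:  # relevance flag
            errors.append(f"retrievals[{i}]: is_relevant must be 0 or 1")
        try:  # rank sanity: a positive integer
            rank_ok = int(row["rank"]) >= 1
        except (KeyError, TypeError, ValueError):
            rank_ok = False
        if not rank_ok:
            errors.append(f"retrievals[{i}]: rank must be an integer >= 1")
    return errors


def load_validated():
    """Raise before any UI code runs if the bundle violates the contract."""
    queries, retrievals = read_rows(QUERIES_CSV), read_rows(RETRIEVALS_CSV)
    errors = validate_bundle(queries, retrievals)
    if errors:
        raise ValueError("invalid data bundle: " + "; ".join(errors))
    return queries, retrievals
```

Collecting all violations before raising, rather than stopping at the first one, is a design choice of this sketch: it lets a single startup failure report every problem in the packaged data.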
What this is not
This project is not a live RAG serving platform. It does not include online ingestion, vector indexing, authentication, scheduled jobs, alerting, or production incident automation.