# Quantitative Portfolio Builder: Architecture & Data Flow

This document maps out the core data processing pipeline, the stochastic optimizer flow, and the post-trade analytics orchestration within the Engine.

## High-Level Architecture

The Engine separates concerns into five layers: Data Ingestion, Risk Modeling, Convex Optimization, Execution & Backtesting, and Post-Trade Reporting.
```mermaid
flowchart TD
    %% Define external data sources
    db[("PostgreSQL / SQLite")]
    yfinance[("Yahoo Finance (API)")]
    fred[("FRED (Macro Data)")]

    %% Ingestion Layer
    subgraph DataLayer ["Data Ingestion & Pre-Processing"]
        db --> |"Raw Pricing"| df_fetch(fetch_data)
        yfinance -.-> |"Fallback"| df_fetch
        fred --> |"Risk Free Rate"| rfr(fetch_risk_free_rate)
        df_fetch --> cleaning[Missing Value Imputation]
        cleaning --> returns[Calculate Daily Returns]
    end

    %% Modeling Layer
    subgraph QuantModels ["Risk & Return Modeling"]
        returns --> ewma[Covariance Estimation]
        returns --> garch[GARCH Volatility Regime]
        returns --> ff[Fama-French Factor Betas]
        returns --> hmm[HMM Regime Detection]
        ewma --> rmt[RMT Noise Filtering]
    end

    %% Optimization Layer
    subgraph SolverEngine ["Convex Optimization (cvxpy)"]
        direction TB
        rmt --> cov[Clean Covariance Matrix]
        garch --> cov
        cov --> cvx_setup[Build CVX Objective]
        ff --> expected_rets[Calculate Expected Returns]
        expected_rets --> cvx_setup
        cvx_setup --> constraints["Apply Bounds, Sectors, Turnover, Risk Limit"]
        constraints --> cvx_solve["Solve ECOS/SCS"]
        cvx_solve --> target_weights[Target Asset Weights]
    end

    %% Execution & Simulation
    subgraph Execution ["Execution & Backtesting"]
        target_weights --> hifo[HIFO Lot Manager]
        target_weights --> exec_cost[Almgren-Chriss Impact]
        hifo --> tax[Tax Drag Calculation]
        exec_cost --> net_curve[Net Equity Curve]
        tax --> net_curve
    end

    %% Post Trade Reporting
    subgraph Reporting ["Reporting & Analytics Builders"]
        target_weights --> mc[Monte Carlo Simulation]
        target_weights --> mvar[Marginal VaR]
        target_weights --> cvar[Component CVaR]
        net_curve --> perf_builder[html_performance.py]
        mc --> risk_builder[html_risk.py]
        cvar --> risk_builder
        mvar --> risk_builder
        hifo --> tax_builder[html_tax.py]
        perf_builder --> report_orchestrator(report_data.py)
        risk_builder --> report_orchestrator
        tax_builder --> report_orchestrator
    end

    %% Connections between major blocks (subgraph IDs must not contain spaces)
    returns --> SolverEngine
    target_weights --> Reporting
    net_curve --> Reporting
```
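
The first two steps of the Data Layer in the diagram (missing value imputation, then daily returns) can be illustrated with a minimal standard-library sketch. The diagram does not say which imputation rule the Engine uses, so forward-fill is an assumption here, and the helper names `forward_fill` and `daily_returns` are hypothetical rather than the Engine's own functions.

```python
from __future__ import annotations

from typing import Optional


def forward_fill(prices: list[Optional[float]]) -> list[float]:
    """Replace each missing price with the last observed one.

    Leading gaps (before the first observation) are dropped because there is
    nothing to carry forward. Forward-fill is an assumed imputation rule.
    """
    filled: list[float] = []
    last: Optional[float] = None
    for price in prices:
        if price is not None:
            last = price
        if last is not None:
            filled.append(last)
    return filled


def daily_returns(prices: list[float]) -> list[float]:
    """Simple returns r_t = p_t / p_{t-1} - 1 for consecutive prices."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


# Toy series with a leading gap and an interior gap (None marks a missing day).
raw = [None, 100.0, None, 110.0, 99.0]
clean = forward_fill(raw)
rets = daily_returns(clean)
```

Note that a forward-filled day produces a zero return, which can understate volatility in the downstream covariance estimate; that trade-off is one reason the imputation rule is worth stating explicitly.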
## The "cfg" Dictionary and EngineConfig

Historically, the engine passed a mutable `cfg` dictionary throughout the entire codebase. This is being replaced by the `EngineConfig` dataclass (defined in `core_types.py`). `EngineConfig` ensures all mathematical parameters (e.g., `garch_enabled`, `cvar_alpha`, `max_turnover`) are strictly typed and immutable during the optimization step.
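
As a rough illustration of the idea, the sketch below shows one way a frozen dataclass can give typed, immutable parameters. The real definition in `core_types.py` is not shown in this document, so the field types, default values, and validation rules are assumptions; only the three field names come from the text.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class EngineConfig:
    # Field names come from the text; types and defaults are assumed.
    garch_enabled: bool = True
    cvar_alpha: float = 0.95
    max_turnover: float = 0.20

    def __post_init__(self) -> None:
        # Assumed sanity checks; dataclasses do not enforce annotations at runtime.
        if not 0.0 < self.cvar_alpha < 1.0:
            raise ValueError("cvar_alpha must lie strictly between 0 and 1")
        if self.max_turnover < 0.0:
            raise ValueError("max_turnover must be non-negative")


config = EngineConfig(cvar_alpha=0.99)

# Attribute assignment on a frozen dataclass raises FrozenInstanceError.
try:
    config.max_turnover = 0.5
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False

# To change a parameter, derive a new instance instead of mutating in place.
relaxed = replace(config, max_turnover=0.5)
```

Because `frozen=True` only blocks attribute assignment, and type annotations are not checked at runtime, any stricter guarantee would need explicit validation such as the `__post_init__` checks above or a static type checker.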
## Circular Dependency Resolution

The core analytical dependency chain follows a strict unidirectional flow to avoid circular imports:

`utils.metrics` ← `analytics.py` ← `solver.py` ← `report_data.py` (which orchestrates the HTML builders inside `report_builders/`).

Each arrow points from the importing module to the module it depends on, so a module may import from modules to its left but never from modules to its right.
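
One way to keep such a chain honest is to check a module-to-imports map for cycles. The sketch below is a hypothetical helper, not part of the Engine: it hard-codes the edges of the chain above by hand and runs a depth-first search. A real check would need to extract the imports from the source files.

```python
from __future__ import annotations


def find_cycle(imports: dict[str, list[str]]) -> list[str] | None:
    """Return one import cycle as a list of modules, or None if acyclic."""
    visiting: list[str] = []   # modules on the current DFS path
    done: set[str] = set()     # modules whose imports are fully explored

    def visit(module: str) -> list[str] | None:
        if module in done:
            return None
        if module in visiting:
            # Back edge: the path from the first occurrence closes the loop.
            return visiting[visiting.index(module):] + [module]
        visiting.append(module)
        for dependency in imports.get(module, []):
            cycle = visit(dependency)
            if cycle is not None:
                return cycle
        visiting.pop()
        done.add(module)
        return None

    for module in imports:
        cycle = visit(module)
        if cycle is not None:
            return cycle
    return None


# Edges mirror the chain in the text: each module imports the one to its left.
chain = {
    "report_data": ["solver"],
    "solver": ["analytics"],
    "analytics": ["utils.metrics"],
    "utils.metrics": [],
}

# Hypothetical regression: utils.metrics reaching back up into the solver.
broken = {**chain, "utils.metrics": ["solver"]}

acyclic_result = find_cycle(chain)
cyclic_result = find_cycle(broken)
```

With the unidirectional chain the search finds nothing, while the hypothetical back-import is reported as a loop that starts and ends at `solver`.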