Premchan369 committed
Commit e91d63c · verified · 1 Parent(s): 1029f6e

Upload README.md

  1. README.md +147 -138
README.md CHANGED
@@ -1,154 +1,163 @@
- # 🏦 AlphaForge - Autonomous Quant Fund OS

- A **production-grade** quantitative trading system with 22 integrated modules: multi-source alpha generation, risk management, portfolio optimization, and derivatives pricing.

- ## 🎯 Why This Is Different

- | Feature | Typical Student Project | AlphaForge |
- |---------|------------------------|------------|
- | Alpha Sources | Only price data | Price + Sentiment + Factors |
- | Model Architecture | Single model | Ensemble + Meta-Model |
- | Risk | Ignored | VaR, CVaR, Tail Risk, Drawdown Control |
- | Regime | Assumed constant | HMM Regime Switching |
- | Uncertainty | Point estimates | Bayesian Probabilistic |
- | Options | None | ML Pricing + Mispricing |
- | Hedging | None | Delta-Neutral Dynamic |
- | Explainability | None | SHAP + Feature Importance |
- | Anomaly Detection | None | Isolation Forest + Autoencoder |
- | Stress Testing | None | 2008, COVID, Flash Crash, etc. |
- | Online Learning | Static | Adaptive with Drift Detection |
- | Dashboard | Console print | Gradio Live Dashboard |

- ## 🏗 Architecture

  ```
- ┌──────────────────────────────────────────────────────────────┐
- │                     A L P H A F O R G E                      │
- │                   Autonomous Quant Fund OS                   │
- ├─────────────────┬──────────────────┬─────────────────────────┤
- │  DATA LAYER     │  MODEL LAYER     │  EXECUTION LAYER        │
- │                 │                  │                         │
- │  Market Data    │  Alpha Ensemble  │  Portfolio Optimizer    │
- │  (OHLCV)        │  (LSTM+Trans+    │  (Mean-Var, Max Sharpe, │
- │      │          │   XGBoost)       │   Robust, Black-Lit)    │
- │      ▼          │      │           │      │                  │
- │  Technical      │  Sentiment Model │  Backtest Engine        │
- │  Indicators     │  (FinBERT)       │  (PnL, Sharpe, Metrics) │
- │      │          │      │           │      │                  │
- │      ▼          │      ▼           │      ▼                  │
- │  Cross-Asset    │  Meta-Model      │  Strategy Ensemble      │
- │  Features       │  (Signal Weights)│  (Dynamic Allocation)   │
- │                 │                  │                         │
- ├─────────────────┼──────────────────┼─────────────────────────┤
- │  RISK LAYER     │  ADVANCED ML     │  MONITORING LAYER       │
- │                 │                  │                         │
- │  Regime Detect  │  Online Learning │  Live Dashboard         │
- │  (HMM)          │  (Adaptive AI)   │  (Gradio)               │
- │      │          │      │           │      │                  │
- │      ▼          │      ▼           │      ▼                  │
- │  Risk Engine    │  Explainability  │  Factor Attribution     │
- │  (VaR/CVaR)     │  (SHAP)          │  (Decomposition)        │
- │      │          │      │           │      │                  │
- │      ▼          │      ▼           │      ▼                  │
- │  Stress Test    │  Anomaly Detect  │  Hedging Engine         │
- │  Engine         │  (IsoForest+AE)  │  (Options-based)        │
- │                 │                  │                         │
- │  Drawdown Ctrl  │  Bayesian Layer  │  Options Pricer         │
- │  (Scaling)      │  (Uncertainty)   │  (NN + Mispricing)      │
- └─────────────────┴──────────────────┴─────────────────────────┘
  ```

- ## 📦 Modules (22 files)
-
- ### Core Pipeline (Papers-backed)
- | # | Module | File | Description |
- |---|--------|------|-------------|
- | 1 | **Alpha Model** | `alpha_model.py` | LSTM + Transformer + XGBoost ensemble |
- | 2 | **Sentiment** | `sentiment_model.py` | FinBERT pipeline (`ProsusAI/finbert`) |
- | 3 | **Volatility** | `volatility_model.py` | GARCH(1,1) + LSTM with skewed-t distribution |
- | 4 | **Portfolio** | `portfolio_optimizer.py` | MV, Max Sharpe, Min Vol, Robust, Black-Litterman |
- | 5 | **Options** | `options_pricer.py` | 4-layer NN pricing + IV + mispricing signals |
- | 6 | **Backtest** | `backtest_engine.py` | Full metrics (Sharpe, Sortino, Calmar, IC) |
- | 7 | **Data** | `market_data.py` | Fetch + 20+ technical indicators |
-
- ### 🔥 High-Value Additions
- | # | Module | File | Description |
- |---|--------|------|-------------|
- | 8 | **Meta-Model** | `meta_model.py` | Learns which model to trust (Renaissance-style) |
- | 9 | **Regime HMM** | `regime_detector.py` | Hidden Markov Model + strategy switching |
- | 10 | **Risk Engine** | `risk_engine.py` | VaR (historical/parametric/Cornish-Fisher), CVaR, tail |
- | 11 | **Factor Decomp** | `factor_decomposition.py` | Momentum, Value, Size, Vol, Quality factors |
- | 12 | **Online Learning** | `online_learning.py` | Incremental SGD + concept drift detection |
- | 13 | **Explainability** | `explainability.py` | SHAP-style feature importance |
- | 14 | **Anomaly Det.** | `anomaly_detector.py` | Isolation Forest + Autoencoder |
- | 15 | **Stress Test** | `stress_test.py` | 2008, COVID, Flash Crash, Monte Carlo |
- | 16 | **Bayesian** | `bayesian_layer.py` | Probabilistic forecasts + shrinkage |
- | 17 | **Hedging** | `hedging_engine.py` | Delta-neutral dynamic hedging |
- | 18 | **Strategy Ensemble** | `strategy_ensemble.py` | Multi-strategy capital allocation |
- | 19 | **Drawdown Ctrl** | `risk_engine.py` | Adaptive position scaling |
- | 20 | **Orchestrator** | `main.py` | Wires everything together |
- | 21 | **Original** | `main_original.py` | Original modular main script |
-
- ## 🚀 Quick Start

  ```bash
  git clone https://huggingface.co/Premchan369/alphaforge-quant-system
  cd alphaforge-quant-system
  pip install -r requirements.txt

- # Run the full pipeline
- python main.py --tickers SPY QQQ AAPL MSFT GOOGL AMZN META NVDA TSLA JPM \
-     --start 2020-01-01 --end 2024-01-01 \
-     --epochs 30 --capital 1000000
  ```

- ## 📊 Dashboard
-
- Live monitoring dashboard at: **https://huggingface.co/spaces/Premchan369/alphaforge-dashboard**
-
- Features:
- - Portfolio equity curve with drawdown
- - Live PnL and risk metrics
- - Covariance matrix heatmap
- - Factor exposure breakdown
- - Options IV surface
- - Anomaly detection tracker
- - Model performance comparison
- - Configuration panel
-
- ## 📈 Key Metrics Tracked
-
- | Metric | Description | Where |
- |--------|-------------|-------|
- | Sharpe Ratio | Risk-adjusted return | Backtest |
- | Sortino Ratio | Downside risk-adjusted | Backtest |
- | Max Drawdown | Peak-to-trough decline | Risk Engine |
- | Calmar Ratio | Return / Max DD | Backtest |
- | VaR (95%, 99%) | Value at Risk | Risk Engine |
- | CVaR | Conditional VaR | Risk Engine |
- | IC | Information Coefficient | Alpha Model |
- | Alpha / Beta | Market-adjusted metrics | Factor Decomp |
- | Factor Exposures | Style factor loadings | Factor Decomp |
- | Win Rate | % of profitable days | Backtest |
- | Hedge Ratio | Delta hedge level | Hedging |
-
- ## 🧠 Research Backing
-
- - **Alpha**: xLSTM-TS (Lopez Gil et al., 2024) + QuantaAlpha (Han et al., 2026)
- - **Sentiment**: FinBERT (Araci, 2019), ChatGPT benchmarking (Fatouros et al., 2023)
- - **Volatility**: LSTM-SSTD (Michankow, 2025)
- - **Portfolio**: MTL-TSMOM (Ong & Herremans, 2023)
- - **Options**: Feed-Forward NN (Berger et al., 2023), PINN (Dhiman et al., 2023)
-
- ## 🏆 Real-World Alignment
-
- This system mirrors production architectures used by:
- - **Renaissance Technologies**: Multi-signal meta-model
- - **Two Sigma**: ML-driven alpha + risk decomposition
- - **Bridgewater**: Regime-based allocation
- - **AQR**: Factor decomposition + systematic strategies
- - **Citadel**: Options pricing + delta hedging
-
- ## 📄 License
-
- MIT
+ # AlphaForge - Multi-Asset Quantitative Trading System v2.0

+ A comprehensive quantitative trading system combining real-time data streaming, advanced feature engineering, multi-model alpha signals, sentiment analysis, volatility forecasting, portfolio optimization, and ML options pricing.

+ ## What's New in v2.0

+ ### Real-Time Data
+ - **Alpaca Markets** WebSocket streaming (free tier, real-time IEX)
+ - **Polygon.io** professional WebSocket (NBBO, trades, aggregates)
+ - **Yahoo Finance** polling (free, 15-min delayed)
+ - **FRED macro data** (yield curve, VIX, credit spreads)
+ - **Live news streaming** with FinBERT sentiment processing
+ - **Order flow estimation** from tick data

+ ### Advanced Feature Engineering (90+ Features)
+ - **Microstructure**: Amihud illiquidity, Kyle's lambda, bid-ask spread proxy, VWAP, Roll spread
+ - **Cross-sectional**: Momentum ranking, mean reversion, return dispersion
+ - **Macro overlay**: Yield curve (10Y-2Y spread, inversion), VIX regime, credit spreads
+ - **Stat-arb**: Cointegration spread, half-life, relative value
+ - **Regime detection**: Volatility regime, trend regime, liquidity regime
+ - **Advanced technicals**: Ichimoku, Supertrend, Keltner channels, Volume profile
+
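Reviewer note on the microstructure bullet above: the commit names Amihud illiquidity and the Roll spread but does not show how `advanced_features_part1.py` computes them. The sketch below uses the textbook definitions from Amihud (2002) and Roll (1984) in plain Python. The function names, the list-based inputs, and the choice to return 0 when the Roll estimator is undefined are assumptions made for illustration, not the repository's code.

```python
import math
from statistics import mean


def amihud_illiquidity(closes, volumes):
    """Mean of |return_t| / dollar_volume_t over the sample (Amihud, 2002)."""
    ratios = []
    for prev, cur, vol in zip(closes, closes[1:], volumes[1:]):
        dollar_volume = cur * vol
        if dollar_volume > 0:  # skip bars with no trading
            ratios.append(abs(cur / prev - 1.0) / dollar_volume)
    return mean(ratios) if ratios else float("nan")


def roll_spread(closes):
    """Roll (1984) effective spread: 2 * sqrt(-cov(dp_t, dp_{t-1}))."""
    changes = [b - a for a, b in zip(closes, closes[1:])]
    current, lagged = changes[1:], changes[:-1]
    if len(current) < 2:
        return float("nan")
    m_cur, m_lag = mean(current), mean(lagged)
    cov = sum((c - m_cur) * (l - m_lag) for c, l in zip(current, lagged))
    cov /= len(current) - 1
    # Assumption: a non-negative autocovariance leaves the estimator
    # undefined, so this sketch reports a zero spread in that case.
    return 2.0 * math.sqrt(-cov) if cov < 0 else 0.0
```

A price series that bounces between bid and ask (100, 101, 100, 101, ...) yields a positive Roll spread, while a steadily trending series yields 0 under the fallback above.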
+ ### Online Learning
+ - **Drift detection**: Kolmogorov-Smirnov test, CUSUM change point
+ - **Adaptive retraining**: Automatic model update when drift detected
+ - **IC tracking**: Real-time information coefficient monitoring
+
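Reviewer note on the drift-detection bullet above: the commit does not include the detector itself, so the following is only a minimal sketch of a two-sided CUSUM change-point check in plain Python. The function name, the `slack` and `threshold` defaults, and the assumption that the input is already scaled so those defaults are meaningful are all hypothetical. A Kolmogorov-Smirnov drift check would instead compare a reference window against a recent window.

```python
def cusum_change_point(values, target_mean, slack=0.5, threshold=5.0):
    """Return the first index where a two-sided CUSUM exceeds threshold, else None.

    slack is the per-step deviation tolerated without accumulating evidence;
    threshold is the accumulated deviation that counts as a change.
    """
    upper = 0.0  # evidence of an upward shift
    lower = 0.0  # evidence of a downward shift
    for i, value in enumerate(values):
        deviation = value - target_mean
        upper = max(0.0, upper + deviation - slack)
        lower = max(0.0, lower - deviation - slack)
        if upper > threshold or lower > threshold:
            return i
    return None
```

In an adaptive-retraining loop, a non-`None` result would be the trigger to refit the model on recent data and reset both accumulators.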
+ ## Architecture

  ```
+ Real-Time Data Feeds (Alpaca/Polygon/Yahoo)
+ |
+ ├──► Advanced Feature Engine (90+ features)
+ |    ├── Microstructure (bid-ask, Kyle lambda, VWAP)
+ |    ├── Cross-Sectional (momentum, dispersion)
+ |    ├── Macro Overlay (VIX, yield curve, credit)
+ |    ├── Regime Detection (vol/trend/liquidity)
+ |    └── Advanced Technicals (Ichimoku, Supertrend)
+ |
+ ├──► News Stream ──► FinBERT ──► Sentiment Alpha (S_t)
+ |
+ ├──► Alpha Model (LSTM + Transformer + XGBoost Ensemble)
+ |    └──► Combined Alpha = w1*Price Alpha + w2*Sentiment Alpha
+ |
+ ├──► Volatility Engine (GARCH + LSTM) ──► Covariance (Σ)
+ |
+ ├──► Portfolio Optimizer
+ |    ├── Max Sharpe
+ |    ├── Min Volatility
+ |    ├── Robust Optimization
+ |    └── Black-Litterman
+ |
+ └──► Backtest Engine ──► PnL, Sharpe, Sortino, Max DD
  ```
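
Reviewer note on the diagram above: the line `Combined Alpha = w1*Price Alpha + w2*Sentiment Alpha` is a plain weighted sum. A minimal sketch of that blend follows. The per-ticker dictionaries, the default weights of 0.7 and 0.3, and the fallback to price alpha when a ticker has no sentiment score are assumptions chosen for illustration. The commit does not state how `w1` and `w2` are set.

```python
def combine_alpha(price_alpha, sentiment_alpha, w1=0.7, w2=0.3):
    """Blend per-ticker price and sentiment scores into one alpha score."""
    combined = {}
    for ticker, price_score in price_alpha.items():
        sentiment_score = sentiment_alpha.get(ticker)
        if sentiment_score is None:
            # Assumption: with no news for a ticker, use the price alpha alone.
            combined[ticker] = price_score
        else:
            combined[ticker] = w1 * price_score + w2 * sentiment_score
    return combined
```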

+ ## Installation

  ```bash
  git clone https://huggingface.co/Premchan369/alphaforge-quant-system
  cd alphaforge-quant-system
  pip install -r requirements.txt

+ # Optional: For FRED macro data
+ export FRED_API_KEY=your_key_here
+
+ # Optional: For Alpaca real-time streaming
+ export ALPACA_API_KEY=your_key_here
+ export ALPACA_SECRET_KEY=your_secret_here
+ ```
+
+ ## Usage
+
+ ### Full Backtest with Advanced Features
+ ```bash
+ # Standard backtest
+ python main.py --mode backtest --start 2020-01-01 --end 2024-01-01
+
+ # With advanced features + macro + sentiment + online learning
+ python main.py --mode backtest --start 2020-01-01 --end 2024-01-01 \
+     --advanced-features --include-macro --include-sentiment --online-learning
+ ```
+
+ ### Real-Time Streaming
+ ```bash
+ # Yahoo Finance (free, 15-min delayed)
+ python main.py --mode realtime --source yahoo --tickers SPY QQQ AAPL MSFT
+
+ # Alpaca (free tier, real-time IEX)
+ python main.py --mode realtime --source alpaca \
+     --api-key YOUR_KEY --secret-key YOUR_SECRET
+
+ # Polygon.io (professional)
+ python main.py --mode realtime --source polygon --api-key YOUR_KEY
+ ```
+
+ ### Train Model
+ ```bash
+ python main.py --mode train --tickers SPY QQQ AAPL MSFT GOOGL AMZN META NVDA TSLA JPM \
+     --epochs 50 --advanced-features --include-macro
+ ```
+
+ ### Options Pricing
+ ```bash
+ python main.py --mode options
  ```

+ ## File Structure
+
+ | File | Description |
+ |------|-------------|
+ | `main.py` | Entry point - train, backtest, or real-time mode |
+ | `market_data.py` | OHLCV data fetching + basic features (RSI, MACD, BB) |
+ | `alpha_model.py` | LSTM/Transformer/XGBoost ensemble with IC tracking |
+ | `sentiment_model.py` | FinBERT sentiment with batch processing |
+ | `volatility_model.py` | GARCH(1,1) + LSTM volatility forecasting |
+ | `portfolio_optimizer.py` | Mean-variance, max-Sharpe, robust, Black-Litterman |
+ | `options_pricer.py` | ML options pricing + mispricing detection |
+ | `backtest_engine.py` | Full backtest with Sharpe, Sortino, max DD, regime detection |
+ | `advanced_features_part1.py` | Microstructure + cross-sectional features |
+ | `macro_features.py` | FRED macro + yield curve + VIX + credit spreads |
+ | `regime_features.py` | Volatility/trend/liquidity regime detection |
+ | `technical_indicators.py` | Ichimoku, Supertrend, Keltner, Volume Profile |
+ | `stat_arb_features.py` | Cointegration, spread, relative value, half-life |
+ | `online_learning.py` | Drift detection (KS, CUSUM) + adaptive retraining |
+ | `realtime_data.py` | Alpaca/Polygon/Yahoo streaming + news + order flow |
+
+ ## Data Sources
+
+ | Source | Type | Cost | Real-Time |
+ |--------|------|------|-----------|
+ | **Yahoo Finance** | OHLCV + News | Free | 15min delayed |
+ | **Alpaca Markets** | Trades + Bars | Free tier | Real-time (IEX) |
+ | **Polygon.io** | NBBO + Trades + Aggs | Paid | Real-time |
+ | **FRED** | Macro (rates, VIX) | Free | Daily |
+ | **FMP** | News + Financials | Free tier | Daily |
+ | **FinBERT** | Sentiment | Free (local) | Batch |
+
+ ## Metrics
+
+ | Metric | Description |
+ |--------|-------------|
+ | **IC** | Information Coefficient (rank correlation predicted vs actual) |
+ | **IC IR** | IC Information Ratio (mean IC / std IC) |
+ | **Sharpe** | Risk-adjusted return (excess return / volatility) |
+ | **Sortino** | Downside risk-adjusted return |
+ | **Max DD** | Maximum peak-to-trough decline |
+ | **Calmar** | Annualized return / max drawdown |
+ | **Alpha/Beta** | Excess return and market sensitivity |
+ | **Turnover** | Portfolio rebalance intensity |
+
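Reviewer note on the metrics table above: the definitions in the table map directly to short formulas. The sketch below shows IC as a Spearman rank correlation, an annualized Sharpe ratio, and maximum drawdown in plain Python. The function names, the per-period `risk_free` rate, and the 252-period annualization are assumptions, not the signatures used in `backtest_engine.py` or `alpha_model.py`. The sketch does not guard against constant inputs, which would make the IC or Sharpe denominators zero.

```python
import math
from statistics import mean, stdev


def _ranks(values):
    """Average 1-based ranks, so tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def information_coefficient(predicted, actual):
    """Spearman rank correlation between predicted and realized returns."""
    rp, ra = _ranks(predicted), _ranks(actual)
    mp, ma = mean(rp), mean(ra)
    cov = sum((p - mp) * (a - ma) for p, a in zip(rp, ra))
    var_p = sum((p - mp) ** 2 for p in rp)
    var_a = sum((a - ma) ** 2 for a in ra)
    return cov / math.sqrt(var_p * var_a)


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (positive fraction)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst
```

IC IR then follows as the mean of a series of per-period IC values divided by their standard deviation, and Calmar as annualized return divided by `max_drawdown`.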
+ ## Research Backing
+
+ - **Alpha Models**: xLSTM-TS with wavelet denoising (Lopez Gil et al., 2024)
+ - **Sentiment**: FinBERT (Araci, 2019) with ChatGPT benchmarking
+ - **Volatility**: LSTM with skewed Student's t (Michańkow, 2025)
+ - **Portfolio**: Multi-task learning joint optimization (Ong & Herremans, 2023)
+ - **Options**: 5-layer FNN outperforming Black-Scholes (Berger et al., 2023)
+ - **Microstructure**: Amihud (2002), Kyle (1985), Corwin-Schultz (2012), Roll (1984)
+ - **Online Learning**: CUSUM change detection, KS drift test
+
+ ## License
+
+ MIT