Commit de8787e by Premchan369 · verified · 1 Parent(s): 90b70c8

Upload README.md

Files changed (1): README.md (+142 -137)
@@ -1,149 +1,154 @@
- # AlphaForge - Multi-Asset Quantitative Trading System
-
- A comprehensive quantitative trading system that combines price-based alpha signals, financial sentiment analysis, volatility forecasting, portfolio optimization, and ML-based options pricing.
-
- ## Features
-
- ### 1. Multi-Asset Alpha Model
- - **LSTM** neural network for sequential pattern recognition
- - **Transformer** architecture for attention-based forecasting
- - **XGBoost** ensemble for robust feature-based predictions
- - **Ensemble** combining all three with IC-weighted blending
- - **IC Tracking**: Information Coefficient monitoring over time
- - **Feature Drift Detection**: XGBoost importance divergence tracking
-
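The IC tracking and IC-weighted blending named in the removed feature list can be sketched roughly as follows. This is an illustration only, not the code in `alpha_model.py`: the function names `spearman_ic` and `ic_weighted_blend`, the choice to clip negative ICs to zero, and the equal-weight fallback are all assumptions made for the example.

```python
from typing import Dict, List


def _ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_ic(predicted: List[float], realized: List[float]) -> float:
    """Information Coefficient as the rank correlation of forecasts and outcomes."""
    rp, rr = _ranks(predicted), _ranks(realized)
    n = len(rp)
    mean_p, mean_r = sum(rp) / n, sum(rr) / n
    cov = sum((a - mean_p) * (b - mean_r) for a, b in zip(rp, rr))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_r = sum((b - mean_r) ** 2 for b in rr)
    if var_p == 0 or var_r == 0:
        return 0.0  # constant input: correlation undefined, treated as no skill
    return cov / (var_p * var_r) ** 0.5


def ic_weighted_blend(preds: Dict[str, List[float]], ics: Dict[str, float]) -> List[float]:
    """Blend per-model forecasts with weights proportional to max(IC, 0) (assumed rule)."""
    raw = {name: max(ics[name], 0.0) for name in preds}
    total = sum(raw.values())
    if total == 0:
        weights = {name: 1.0 / len(preds) for name in preds}  # fallback: equal weights
    else:
        weights = {name: w / total for name, w in raw.items()}
    n_assets = len(next(iter(preds.values())))
    return [sum(weights[m] * preds[m][i] for m in preds) for i in range(n_assets)]
```

In this sketch a model whose recent IC is negative simply receives zero weight; a real system might instead shrink ICs toward zero or smooth them over a rolling window.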
- ### 2. News + Sentiment Alpha (FinBERT)
- - Uses `ProsusAI/finbert` for financial sentiment analysis
- - Converts news/social media into numerical alpha signals
- - Confidence-weighted aggregation per asset per day
- - Synthetic news generation for testing
-
- ### 3. Volatility Forecasting Engine
- - **GARCH(1,1)** with Student-t errors for baseline
- - **LSTM** with skewed Student's t distributional output
- - **EWMA covariance** matrix construction
- - Positive definite enforcement
-
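As a rough illustration of the EWMA covariance and positive-definite items above, the sketch below uses the standard recursion S_t = λ·S_{t-1} + (1 − λ)·r_t·r_tᵀ on (assumed) zero-mean returns. The README does not say how positive definiteness is enforced; the diagonal ridge used here is one simple option (eigenvalue clipping is another), and the names `ewma_covariance` and `add_ridge` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def ewma_covariance(returns: List[List[float]], lam: float = 0.94) -> Matrix:
    """EWMA covariance from a T x N list of return vectors.

    Assumes zero-mean returns and seeds the recursion with the outer
    product of the first observation. lam = 0.94 is the common
    RiskMetrics choice for daily data.
    """
    n = len(returns[0])
    first = returns[0]
    cov = [[first[i] * first[j] for j in range(n)] for i in range(n)]
    for r in returns[1:]:
        cov = [
            [lam * cov[i][j] + (1.0 - lam) * (r[i] * r[j]) for j in range(n)]
            for i in range(n)
        ]
    return cov


def add_ridge(cov: Matrix, eps: float = 1e-8) -> Matrix:
    """Add eps to the diagonal.

    An EWMA of outer products is positive semi-definite, so adding
    eps * I makes it strictly positive definite.
    """
    n = len(cov)
    return [[cov[i][j] + (eps if i == j else 0.0) for j in range(n)] for i in range(n)]
```

The ridge barely changes the estimate but keeps the matrix invertible when there are fewer observations than assets, which matters for the optimizer downstream.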
- ### 4. Portfolio Optimizer
- - Mean-variance optimization with transaction costs
- - Max Sharpe ratio optimization
- - Minimum volatility with return constraints
- - **Robust optimization** with uncertainty sets
- - **Black-Litterman** model for incorporating views
- - Efficient frontier computation
-
- ### 5. Options Pricing with ML
- - 4-layer neural network (256-128-64-32)
- - Black-Scholes baseline for comparison
- - Implied volatility prediction
- - **Mispricing detection** for arbitrage signals
- - Synthetic data generation for training
-
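For reference, the Black-Scholes baseline mentioned above can be written in a few lines. The pricing formula is the standard one for a European call on a non-dividend-paying asset; the `mispricing_signal` helper, its 5% relative threshold, and its +1/0/-1 convention are assumptions made for illustration, since the README does not describe how mispricing is actually flagged in `options_pricer.py`.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, rate: float,
                       vol: float, maturity: float) -> float:
    """European call price (no dividends); rate and vol are annualized, maturity in years."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def mispricing_signal(market_price: float, model_price: float,
                      threshold: float = 0.05) -> int:
    """Hypothetical rule: +1 if the market looks cheap versus the model,
    -1 if it looks rich, 0 if the relative gap is inside the threshold."""
    gap = (model_price - market_price) / model_price
    if gap > threshold:
        return 1
    if gap < -threshold:
        return -1
    return 0


if __name__ == "__main__":
    price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
    print(round(price, 4), mispricing_signal(market_price=9.0, model_price=price))
```

In the system described here, `model_price` would presumably come from the neural network pricer, with the Black-Scholes value serving as the comparison baseline.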
- ### 6. Backtest Engine
- - Transaction cost and slippage simulation
- - Comprehensive metrics:
- - Sharpe, Sortino, Calmar ratios
- - Max drawdown, win rate
- - Alpha, Beta, Information Ratio
- - Turnover and cost analysis
- - Regime detection (bull/bear/high-vol)
- - Rolling performance metrics
-
- ## Installation

- ```bash
- git clone https://huggingface.co/Premchan369/alphaforge-quant-system
- cd alphaforge-quant-system
- pip install -r requirements.txt
- ```

- ## Usage

- ### Train Alpha Model
- ```bash
- python main.py --mode train --tickers SPY QQQ AAPL MSFT --epochs 50
- ```

- ### Run Full Backtest
- ```bash
- python main.py --mode backtest --start 2020-01-01 --end 2024-01-01
- ```

- ### Train Options Model
- ```bash
- python main.py --mode options
  ```

- ## Pipeline Architecture

- ```
- Market Data (OHLCV)
-     |
-     +---> Technical Indicators (RSI, MACD, Bollinger, etc.)
-     +---> Cross-Asset Features (beta, correlation, spreads)
-     |
-     v
- Alpha Model (LSTM + Transformer + XGBoost Ensemble)
-     |---> Predicted Returns (mu)
-     |---> IC Tracking
-     |
- News Data
-     |
-     v
- Sentiment Model (FinBERT)
-     |---> Sentiment Alpha (S_t)
-     |
-     v
- Combined Alpha = w1 * Price Alpha + w2 * Sentiment Alpha
-
- Market Data
-     |
-     v
- Volatility Engine (GARCH + LSTM)
-     |---> Covariance Matrix (Sigma)
-     |
-     v
- Portfolio Optimizer (Mean-Variance / Max Sharpe / Robust)
-     |---> Optimal Weights (w)
-     |
-     v
- Backtest Engine
-     |---> PnL, Sharpe, Drawdown, etc.
  ```

- ## Key Metrics
-
- | Metric | Description |
- |--------|-------------|
- | **IC** | Information Coefficient (rank correlation between predicted and actual returns) |
- | **Sharpe** | Risk-adjusted return (excess return / volatility) |
- | **Sortino** | Downside risk-adjusted return |
- | **Max DD** | Maximum peak-to-trough decline |
- | **Calmar** | Annualized return / max drawdown |
- | **Alpha** | Excess return vs benchmark |
- | **Beta** | Market sensitivity |
-
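The ratio definitions in the table above translate directly into code. The sketch below is a minimal version working on a list of daily returns; the 252-day annualization, the zero default risk-free rate, and the convention of returning infinity when there is no downside or no drawdown are assumptions, and the function names are not taken from `backtest_engine.py`.

```python
import math
from statistics import mean, stdev
from typing import List

TRADING_DAYS = 252  # assumed annualization factor for daily data


def sharpe_ratio(returns: List[float], rf_daily: float = 0.0) -> float:
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - rf_daily for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(TRADING_DAYS)


def sortino_ratio(returns: List[float], rf_daily: float = 0.0) -> float:
    """Like Sharpe, but the denominator only counts below-target (negative excess) returns."""
    excess = [r - rf_daily for r in returns]
    downside = math.sqrt(mean([min(e, 0.0) ** 2 for e in excess]))
    if downside == 0.0:
        return math.inf  # no losing periods in the sample
    return mean(excess) / downside * math.sqrt(TRADING_DAYS)


def max_drawdown(returns: List[float]) -> float:
    """Largest peak-to-trough decline of the compounded equity curve, as a positive fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def calmar_ratio(returns: List[float]) -> float:
    """Annualized compounded return divided by maximum drawdown."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    years = len(returns) / TRADING_DAYS
    annual_return = growth ** (1.0 / years) - 1.0
    mdd = max_drawdown(returns)
    return math.inf if mdd == 0.0 else annual_return / mdd
```

Calmar is only meaningful over samples of a year or more; on a handful of days the annualized return in the numerator is dominated by noise.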
- ## File Structure
-
- | File | Description |
- |------|-------------|
- | `main.py` | Entry point and orchestration |
- | `market_data.py` | Data fetching and feature engineering |
- | `alpha_model.py` | LSTM/Transformer/XGBoost ensemble |
- | `sentiment_model.py` | FinBERT sentiment analysis |
- | `volatility_model.py` | GARCH + LSTM volatility forecasting |
- | `portfolio_optimizer.py` | Mean-variance and robust optimization |
- | `options_pricer.py` | ML options pricing and mispricing detection |
- | `backtest_engine.py` | Backtesting with comprehensive metrics |
-
- ## Research Backing
-
- - **Alpha Models**: xLSTM-TS with wavelet denoising (Lopez Gil et al., 2024)
- - **Sentiment**: FinBERT (Araci, 2019) with ChatGPT benchmarking (Fatouros et al., 2023)
- - **Volatility**: LSTM with skewed Student's t (Michankow, 2025)
- - **Portfolio**: Multi-task learning for joint optimization (Ong & Herremans, 2023)
- - **Options**: 5-layer FNN outperforming Black-Scholes (Berger et al., 2023)
-
- ## License

  MIT

+ # 🏦 AlphaForge - Autonomous Quant Fund OS

+ A **production-grade** quantitative trading system with 22 integrated modules: multi-source alpha generation, risk management, portfolio optimization, and derivatives pricing.

+ ## 🎯 Why This Is Different

+ | Feature | Typical Student Project | AlphaForge |
+ |---------|------------------------|------------|
+ | Alpha Sources | Only price data | Price + Sentiment + Factors |
+ | Model Architecture | Single model | Ensemble + Meta-Model |
+ | Risk | Ignored | VaR, CVaR, Tail Risk, Drawdown Control |
+ | Regime | Assumed constant | HMM Regime Switching |
+ | Uncertainty | Point estimates | Bayesian Probabilistic |
+ | Options | None | ML Pricing + Mispricing |
+ | Hedging | None | Delta-Neutral Dynamic |
+ | Explainability | None | SHAP + Feature Importance |
+ | Anomaly Detection | None | Isolation Forest + Autoencoder |
+ | Stress Testing | None | 2008, COVID, Flash Crash, etc. |
+ | Online Learning | Static | Adaptive with Drift Detection |
+ | Dashboard | Console print | Gradio Live Dashboard |

+ ## 🏗️ Architecture

+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │                     A L P H A F O R G E                     │
+ │                  Autonomous Quant Fund OS                   │
+ ├─────────────────┬──────────────────┬────────────────────────┤
+ │ DATA LAYER      │ MODEL LAYER      │ EXECUTION LAYER        │
+ │                 │                  │                        │
+ │ Market Data     │ Alpha Ensemble   │ Portfolio Optimizer    │
+ │ (OHLCV)         │ (LSTM+Trans+     │ (Mean-Var, Max Sharpe, │
+ │      │          │  XGBoost)        │  Robust, Black-Lit)    │
+ │      ▼          │      │           │      │                 │
+ │ Technical       │ Sentiment Model  │ Backtest Engine        │
+ │ Indicators      │ (FinBERT)        │ (PnL, Sharpe, Metrics) │
+ │      │          │      │           │      │                 │
+ │      ▼          │      ▼           │      ▼                 │
+ │ Cross-Asset     │ Meta-Model       │ Strategy Ensemble      │
+ │ Features        │ (Signal Weights) │ (Dynamic Allocation)   │
+ │                 │                  │                        │
+ ├─────────────────┼──────────────────┼────────────────────────┤
+ │ RISK LAYER      │ ADVANCED ML      │ MONITORING LAYER       │
+ │                 │                  │                        │
+ │ Regime Detect   │ Online Learning  │ Live Dashboard         │
+ │ (HMM)           │ (Adaptive AI)    │ (Gradio)               │
+ │      │          │      │           │      │                 │
+ │      ▼          │      ▼           │      ▼                 │
+ │ Risk Engine     │ Explainability   │ Factor Attribution     │
+ │ (VaR/CVaR)      │ (SHAP)           │ (Decomposition)        │
+ │      │          │      │           │      │                 │
+ │      ▼          │      ▼           │      ▼                 │
+ │ Stress Test     │ Anomaly Detect   │ Hedging Engine         │
+ │ Engine          │ (IsoForest+AE)   │ (Options-based)        │
+ │                 │                  │                        │
+ │ Drawdown Ctrl   │ Bayesian Layer   │ Options Pricer         │
+ │ (Scaling)       │ (Uncertainty)    │ (NN + Mispricing)      │
+ └─────────────────┴──────────────────┴────────────────────────┘
  ```

+ ## 📦 Modules (22 files)
+
+ ### Core Pipeline (Papers-backed)
+ | # | Module | File | Description |
+ |---|--------|------|-------------|
+ | 1 | **Alpha Model** | `alpha_model.py` | LSTM + Transformer + XGBoost ensemble |
+ | 2 | **Sentiment** | `sentiment_model.py` | FinBERT pipeline (`ProsusAI/finbert`) |
+ | 3 | **Volatility** | `volatility_model.py` | GARCH(1,1) + LSTM with skewed-t distribution |
+ | 4 | **Portfolio** | `portfolio_optimizer.py` | MV, Max Sharpe, Min Vol, Robust, Black-Litterman |
+ | 5 | **Options** | `options_pricer.py` | 4-layer NN pricing + IV + mispricing signals |
+ | 6 | **Backtest** | `backtest_engine.py` | Full metrics (Sharpe, Sortino, Calmar, IC) |
+ | 7 | **Data** | `market_data.py` | Fetch + 20+ technical indicators |
+
+ ### 🔥 High-Value Additions
+ | # | Module | File | Description |
+ |---|--------|------|-------------|
+ | 8 | **Meta-Model** | `meta_model.py` | Learns which model to trust (Renaissance-style) |
+ | 9 | **Regime HMM** | `regime_detector.py` | Hidden Markov Model + strategy switching |
+ | 10 | **Risk Engine** | `risk_engine.py` | VaR (historical/parametric/Cornish-Fisher), CVaR, tail |
+ | 11 | **Factor Decomp** | `factor_decomposition.py` | Momentum, Value, Size, Vol, Quality factors |
+ | 12 | **Online Learning** | `online_learning.py` | Incremental SGD + concept drift detection |
+ | 13 | **Explainability** | `explainability.py` | SHAP-style feature importance |
+ | 14 | **Anomaly Det.** | `anomaly_detector.py` | Isolation Forest + Autoencoder |
+ | 15 | **Stress Test** | `stress_test.py` | 2008, COVID, Flash Crash, Monte Carlo |
+ | 16 | **Bayesian** | `bayesian_layer.py` | Probabilistic forecasts + shrinkage |
+ | 17 | **Hedging** | `hedging_engine.py` | Delta-neutral dynamic hedging |
+ | 18 | **Strategy Ensemble** | `strategy_ensemble.py` | Multi-strategy capital allocation |
+ | 19 | **Drawdown Ctrl** | `risk_engine.py` | Adaptive position scaling |
+ | 20 | **Orchestrator** | `main.py` | Wires everything together |
+ | 21 | **Original** | `main_original.py` | Original modular main script |
+
+ ## 🚀 Quick Start

+ ```bash
+ git clone https://huggingface.co/Premchan369/alphaforge-quant-system
+ cd alphaforge-quant-system
+ pip install -r requirements.txt
+
+ # Run the full pipeline
+ python main.py --tickers SPY QQQ AAPL MSFT GOOGL AMZN META NVDA TSLA JPM \
+     --start 2020-01-01 --end 2024-01-01 \
+     --epochs 30 --capital 1000000
  ```

+ ## 📊 Dashboard
+
+ Live monitoring dashboard at: **https://huggingface.co/spaces/Premchan369/alphaforge-dashboard**
+
+ Features:
+ - Portfolio equity curve with drawdown
+ - Live PnL and risk metrics
+ - Covariance matrix heatmap
+ - Factor exposure breakdown
+ - Options IV surface
+ - Anomaly detection tracker
+ - Model performance comparison
+ - Configuration panel
+
+ ## 📈 Key Metrics Tracked
+
+ | Metric | Description | Where |
+ |--------|-------------|-------|
+ | Sharpe Ratio | Risk-adjusted return | Backtest |
+ | Sortino Ratio | Downside risk-adjusted | Backtest |
+ | Max Drawdown | Peak-to-trough decline | Risk Engine |
+ | Calmar Ratio | Return / Max DD | Backtest |
+ | VaR (95%, 99%) | Value at Risk | Risk Engine |
+ | CVaR | Conditional VaR | Risk Engine |
+ | IC | Information Coefficient | Alpha Model |
+ | Alpha / Beta | Market-adjusted metrics | Factor Decomp |
+ | Factor Exposures | Style factor loadings | Factor Decomp |
+ | Win Rate | % of profitable days | Backtest |
+ | Hedge Ratio | Delta hedge level | Hedging |
+
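To make the VaR and CVaR rows above concrete, here is a minimal historical-simulation sketch. It is not the code in `risk_engine.py`: the function name `historical_var_cvar`, the empirical-quantile convention (several exist), and the choice to report losses as positive numbers are assumptions for this example. The parametric and Cornish-Fisher variants listed in the module table are not shown.

```python
import math
from typing import List, Tuple


def historical_var_cvar(returns: List[float], level: float = 0.95) -> Tuple[float, float]:
    """Historical VaR and CVaR, both reported as positive loss fractions.

    VaR is the empirical `level` quantile of the loss distribution;
    CVaR (expected shortfall) is the average of losses at or beyond VaR.
    """
    losses = sorted(-r for r in returns)  # ascending: largest loss is last
    n = len(losses)
    # Index of the smallest loss that at least `level` of observations do not exceed.
    # The small tolerance guards against floating-point noise in level * n.
    idx = max(min(math.ceil(level * n - 1e-9) - 1, n - 1), 0)
    var = losses[idx]
    tail = losses[idx:]
    cvar = sum(tail) / len(tail)
    return var, cvar


if __name__ == "__main__":
    sample = [-0.10, -0.05, -0.02, 0.0, 0.01, 0.01, 0.02, 0.02, 0.03, 0.04]
    print(historical_var_cvar(sample, level=0.9))
```

Because CVaR averages the whole tail rather than reading off a single quantile, it is always at least as large as VaR at the same level and reacts to how bad the worst days are, not just how often they occur.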
+ ## 🧠 Research Backing
+
+ - **Alpha**: xLSTM-TS (Lopez Gil et al., 2024) + QuantaAlpha (Han et al., 2026)
+ - **Sentiment**: FinBERT (Araci, 2019), ChatGPT benchmarking (Fatouros et al., 2023)
+ - **Volatility**: LSTM-SSTD (Michankow, 2025)
+ - **Portfolio**: MTL-TSMOM (Ong & Herremans, 2023)
+ - **Options**: Feed-Forward NN (Berger et al., 2023), PINN (Dhiman et al., 2023)
+
+ ## 🏆 Real-World Alignment
+
+ This system mirrors production architectures used by:
+ - **Renaissance Technologies**: Multi-signal meta-model
+ - **Two Sigma**: ML-driven alpha + risk decomposition
+ - **Bridgewater**: Regime-based allocation
+ - **AQR**: Factor decomposition + systematic strategies
+ - **Citadel**: Options pricing + delta hedging
+
+ ## 📄 License

  MIT