Upload README.md
README.md
CHANGED
|
@@ -1,154 +1,163 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
-
A
|
| 4 |
|
| 5 |
-
##
|
| 6 |
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
| Options | None | ML Pricing + Mispricing |
|
| 15 |
-
| Hedging | None | Delta-Neutral Dynamic |
|
| 16 |
-
| Explainability | None | SHAP + Feature Importance |
|
| 17 |
-
| Anomaly Detection | None | Isolation Forest + Autoencoder |
|
| 18 |
-
| Stress Testing | None | 2008, COVID, Flash Crash, etc. |
|
| 19 |
-
| Online Learning | Static | Adaptive with Drift Detection |
|
| 20 |
-
| Dashboard | Console print | Gradio Live Dashboard |
|
| 21 |
|
| 22 |
-
##
|
| 23 |
|
| 24 |
```
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
───
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
───
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
|
| 47 |
-
|
| 48 |
-
│        ▼        │        ▼         │           ▼            │
|
| 49 |
-
│ Risk Engine     │ Explainability   │ Factor Attribution     │
|
| 50 |
-
│ (VaR/CVaR)      │ (SHAP)           │ (Decomposition)        │
|
| 51 |
-
│        │        │        │         │           │            │
|
| 52 |
-
│        ▼        │        ▼         │           ▼            │
|
| 53 |
-
│ Stress Test     │ Anomaly Detect   │ Hedging Engine         │
|
| 54 |
-
│ Engine          │ (IsoForest+AE)   │ (Options-based)        │
|
| 55 |
-
│                 │                  │                        │
|
| 56 |
-
│ Drawdown Ctrl   │ Bayesian Layer   │ Options Pricer         │
|
| 57 |
-
│ (Scaling)       │ (Uncertainty)    │ (NN + Mispricing)      │
|
| 58 |
-
└─────────────────┴──────────────────┴────────────────────────┘
|
| 59 |
```
|
| 60 |
|
| 61 |
-
##
|
| 62 |
-
|
| 63 |
-
### Core Pipeline (Papers-backed)
|
| 64 |
-
| # | Module | File | Description |
|
| 65 |
-
|---|--------|------|-------------|
|
| 66 |
-
| 1 | **Alpha Model** | `alpha_model.py` | LSTM + Transformer + XGBoost ensemble |
|
| 67 |
-
| 2 | **Sentiment** | `sentiment_model.py` | FinBERT pipeline (`ProsusAI/finbert`) |
|
| 68 |
-
| 3 | **Volatility** | `volatility_model.py` | GARCH(1,1) + LSTM with skewed-t distribution |
|
| 69 |
-
| 4 | **Portfolio** | `portfolio_optimizer.py` | MV, Max Sharpe, Min Vol, Robust, Black-Litterman |
|
| 70 |
-
| 5 | **Options** | `options_pricer.py` | 4-layer NN pricing + IV + mispricing signals |
|
| 71 |
-
| 6 | **Backtest** | `backtest_engine.py` | Full metrics (Sharpe, Sortino, Calmar, IC) |
|
| 72 |
-
| 7 | **Data** | `market_data.py` | Fetch + 20+ technical indicators |
|
| 73 |
-
|
| 74 |
-
### 🔥 High-Value Additions
|
| 75 |
-
| # | Module | File | Description |
|
| 76 |
-
|---|--------|------|-------------|
|
| 77 |
-
| 8 | **Meta-Model** | `meta_model.py` | Learns which model to trust (Renaissance-style) |
|
| 78 |
-
| 9 | **Regime HMM** | `regime_detector.py` | Hidden Markov Model + strategy switching |
|
| 79 |
-
| 10 | **Risk Engine** | `risk_engine.py` | VaR (historical/parametric/Cornish-Fisher), CVaR, tail |
|
| 80 |
-
| 11 | **Factor Decomp** | `factor_decomposition.py` | Momentum, Value, Size, Vol, Quality factors |
|
| 81 |
-
| 12 | **Online Learning** | `online_learning.py` | Incremental SGD + concept drift detection |
|
| 82 |
-
| 13 | **Explainability** | `explainability.py` | SHAP-style feature importance |
|
| 83 |
-
| 14 | **Anomaly Det.** | `anomaly_detector.py` | Isolation Forest + Autoencoder |
|
| 84 |
-
| 15 | **Stress Test** | `stress_test.py` | 2008, COVID, Flash Crash, Monte Carlo |
|
| 85 |
-
| 16 | **Bayesian** | `bayesian_layer.py` | Probabilistic forecasts + shrinkage |
|
| 86 |
-
| 17 | **Hedging** | `hedging_engine.py` | Delta-neutral dynamic hedging |
|
| 87 |
-
| 18 | **Strategy Ensemble** | `strategy_ensemble.py` | Multi-strategy capital allocation |
|
| 88 |
-
| 19 | **Drawdown Ctrl** | `risk_engine.py` | Adaptive position scaling |
|
| 89 |
-
| 20 | **Orchestrator** | `main.py` | Wires everything together |
|
| 90 |
-
| 21 | **Original** | `main_original.py` | Original modular main script |
|
| 91 |
-
|
| 92 |
-
## Quick Start
|
| 93 |
|
| 94 |
```bash
|
| 95 |
git clone https://huggingface.co/Premchan369/alphaforge-quant-system
|
| 96 |
cd alphaforge-quant-system
|
| 97 |
pip install -r requirements.txt
|
| 98 |
|
| 99 |
-
#
|
| 100 |
-
|
| 101 |
-
|
| 102 |
-
|
| 103 |
```
|
| 104 |
|
| 105 |
-
##
|
| 106 |
-
|
| 107 |
-
|
| 108 |
-
|
| 109 |
-
|
| 110 |
-
|
| 111 |
-
|
| 112 |
-
|
| 113 |
-
|
| 114 |
-
-
|
| 115 |
-
|
| 116 |
-
|
| 117 |
-
-
|
| 118 |
-
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
|
| 122 |
-
|
|
| 123 |
-
|
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
|
|
| 128 |
-
|
|
| 129 |
-
|
|
| 130 |
-
|
|
| 131 |
-
|
|
| 132 |
-
|
|
| 133 |
-
|
|
| 134 |
-
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
|
| 138 |
-
|
| 139 |
-
-
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
|
| 147 |
-
|
| 148 |
-
|
| 149 |
-
|
| 150 |
-
|
| 151 |
-
|
| 152 |
-
|
| 153 |
-
|
| 154 |
-
|
| 1 |
+
# AlphaForge - Multi-Asset Quantitative Trading System v2.0
|
| 2 |
|
| 3 |
+
A comprehensive quantitative trading system combining real-time data streaming, advanced feature engineering, multi-model alpha signals, sentiment analysis, volatility forecasting, portfolio optimization, and ML options pricing.
|
| 4 |
|
| 5 |
+
## What's New in v2.0
|
| 6 |
|
| 7 |
+
### Real-Time Data
|
| 8 |
+
- **Alpaca Markets** WebSocket streaming (free tier, real-time IEX)
|
| 9 |
+
- **Polygon.io** professional WebSocket (NBBO, trades, aggregates)
|
| 10 |
+
- **Yahoo Finance** polling (free, 15-min delayed)
|
| 11 |
+
- **FRED macro data** (yield curve, VIX, credit spreads)
|
| 12 |
+
- **Live news streaming** with FinBERT sentiment processing
|
| 13 |
+
- **Order flow estimation** from tick data
|
| 14 |
|
| 15 |
+
### Advanced Feature Engineering (90+ Features)
|
| 16 |
+
- **Microstructure**: Amihud illiquidity, Kyle's lambda, bid-ask spread proxy, VWAP, Roll spread
|
| 17 |
+
- **Cross-sectional**: Momentum ranking, mean reversion, return dispersion
|
| 18 |
+
- **Macro overlay**: Yield curve (10Y-2Y spread, inversion), VIX regime, credit spreads
|
| 19 |
+
- **Stat-arb**: Cointegration spread, half-life, relative value
|
| 20 |
+
- **Regime detection**: Volatility regime, trend regime, liquidity regime
|
| 21 |
+
- **Advanced technicals**: Ichimoku, Supertrend, Keltner channels, Volume profile
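
Two of the microstructure features listed above have compact textbook definitions. The following is a minimal sketch of Amihud illiquidity and the Roll spread estimator, assuming plain lists of daily closes and share volumes; the function names are illustrative and are not taken from `advanced_features_part1.py`.

```python
import math
from statistics import mean


def amihud_illiquidity(closes, volumes):
    """Average of |return| / dollar volume over the window (Amihud, 2002)."""
    ratios = []
    for prev, cur, vol in zip(closes, closes[1:], volumes[1:]):
        dollar_volume = cur * vol
        if dollar_volume > 0:
            ratios.append(abs(cur / prev - 1.0) / dollar_volume)
    return mean(ratios) if ratios else float("nan")


def roll_spread(closes):
    """Roll (1984) estimator: 2 * sqrt(-cov(dp_t, dp_{t-1})), or 0 if cov >= 0."""
    changes = [b - a for a, b in zip(closes, closes[1:])]
    x, y = changes[1:], changes[:-1]
    if len(x) < 2:
        return float("nan")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    # A non-negative autocovariance has no spread interpretation; report 0.
    return 2.0 * math.sqrt(-cov) if cov < 0 else 0.0


# Prices that bounce between two levels produce negative autocovariance.
closes = [100.0, 100.1, 100.0, 100.1, 100.0, 100.1]
volumes = [1_000, 1_200, 900, 1_100, 1_000, 1_300]
print(amihud_illiquidity(closes, volumes), roll_spread(closes))
```

Higher Amihud values indicate that a given dollar volume moves the price more, so the feature rises as liquidity dries up.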
|
| 22 |
+
|
| 23 |
+
### Online Learning
|
| 24 |
+
- **Drift detection**: Kolmogorov-Smirnov test, CUSUM change point
|
| 25 |
+
- **Adaptive retraining**: Automatic model update when drift detected
|
| 26 |
+
- **IC tracking**: Real-time information coefficient monitoring
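
The two drift tests named above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the code in `online_learning.py`: the function names, the 5% significance level, and the CUSUM `slack` and `threshold` parameters are all hypothetical choices.

```python
import math


def ks_statistic(reference, recent):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the ECDFs."""
    ref, rec = sorted(reference), sorted(recent)
    i = j = 0
    d = 0.0
    while i < len(ref) and j < len(rec):
        x = min(ref[i], rec[j])
        while i < len(ref) and ref[i] <= x:
            i += 1
        while j < len(rec) and rec[j] <= x:
            j += 1
        d = max(d, abs(i / len(ref) - j / len(rec)))
    return d


def ks_drift(reference, recent, alpha=0.05):
    """Flag drift when D exceeds the large-sample critical value."""
    n, m = len(reference), len(recent)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, recent) > critical


def cusum_alarm(values, target, slack, threshold):
    """Two-sided tabular CUSUM: index of the first alarm, or None."""
    hi = lo = 0.0
    for t, v in enumerate(values):
        hi = max(0.0, hi + (v - target - slack))  # accumulates upward shifts
        lo = max(0.0, lo + (target - v - slack))  # accumulates downward shifts
        if hi > threshold or lo > threshold:
            return t
    return None


reference = [i / 100 for i in range(100)]
shifted = [1 + i / 100 for i in range(100)]
print(ks_drift(reference, shifted), cusum_alarm([0, 0, 0, 1, 1, 1, 1], 0.0, 0.25, 2.0))
```

In an adaptive retraining loop, either signal returning a positive result would be the trigger to refit the model on recent data.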
|
| 27 |
+
|
| 28 |
+
## Architecture
|
| 29 |
|
| 30 |
```
|
| 31 |
+
Real-Time Data Feeds (Alpaca/Polygon/Yahoo)
|
| 32 |
+
|
|
| 33 |
+
├──► Advanced Feature Engine (90+ features)
|
| 34 |
+
|    ├── Microstructure (bid-ask, Kyle lambda, VWAP)
|
| 35 |
+
|    ├── Cross-Sectional (momentum, dispersion)
|
| 36 |
+
|    ├── Macro Overlay (VIX, yield curve, credit)
|
| 37 |
+
|    ├── Regime Detection (vol/trend/liquidity)
|
| 38 |
+
|    └── Advanced Technicals (Ichimoku, Supertrend)
|
| 39 |
+
|
|
| 40 |
+
├──► News Stream ──► FinBERT ──► Sentiment Alpha (S_t)
|
| 41 |
+
|
|
| 42 |
+
├──► Alpha Model (LSTM + Transformer + XGBoost Ensemble)
|
| 43 |
+
|    └──► Combined Alpha = w1*Price Alpha + w2*Sentiment Alpha
|
| 44 |
+
|
|
| 45 |
+
├──► Volatility Engine (GARCH + LSTM) ──► Covariance (Σ)
|
| 46 |
+
|
|
| 47 |
+
├──► Portfolio Optimizer
|
| 48 |
+
|    ├── Max Sharpe
|
| 49 |
+
|    ├── Min Volatility
|
| 50 |
+
|    ├── Robust Optimization
|
| 51 |
+
|    └── Black-Litterman
|
| 52 |
+
|
|
| 53 |
+
└──► Backtest Engine ──► PnL, Sharpe, Sortino, Max DD
|
| 54 |
```
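
The `Combined Alpha = w1*Price Alpha + w2*Sentiment Alpha` step in the diagram can be illustrated as a weighted blend of two per-ticker scores. The sketch below is an assumption about how such a blend might look: the weights `w1=0.7` and `w2=0.3`, the cross-sectional z-scoring, and the neutral fallback for tickers without sentiment are hypothetical, not values taken from the repository.

```python
from statistics import mean, pstdev


def zscore(signal):
    """Standardize a cross-section of scores; constant input maps to zeros."""
    values = list(signal.values())
    mu, sigma = mean(values), pstdev(values)
    return {k: (v - mu) / sigma if sigma > 0 else 0.0 for k, v in signal.items()}


def combine_alpha(price_alpha, sentiment_alpha, w1=0.7, w2=0.3):
    """Combined Alpha = w1 * Price Alpha + w2 * Sentiment Alpha, per ticker."""
    p, s = zscore(price_alpha), zscore(sentiment_alpha)
    # Tickers with no sentiment score fall back to a neutral 0.0.
    return {t: w1 * p[t] + w2 * s.get(t, 0.0) for t in p}


price = {"SPY": 0.8, "QQQ": 0.2, "AAPL": -0.4}
sentiment = {"SPY": 0.1, "QQQ": 0.6}
print(combine_alpha(price, sentiment))
```

Standardizing both inputs first keeps the weights meaningful when the two signals live on different scales.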
|
| 55 |
|
| 56 |
+
## Installation
|
| 57 |
|
| 58 |
```bash
|
| 59 |
git clone https://huggingface.co/Premchan369/alphaforge-quant-system
|
| 60 |
cd alphaforge-quant-system
|
| 61 |
pip install -r requirements.txt
|
| 62 |
|
| 63 |
+
# Optional: For FRED macro data
|
| 64 |
+
export FRED_API_KEY=your_key_here
|
| 65 |
+
|
| 66 |
+
# Optional: For Alpaca real-time streaming
|
| 67 |
+
export ALPACA_API_KEY=your_key_here
|
| 68 |
+
export ALPACA_SECRET_KEY=your_secret_here
|
| 69 |
+
```
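
Because all three keys are optional, a script can check which ones are set before choosing a data source. The helper below is a hypothetical sketch of that check; how `main.py` actually reads these variables is not shown in this README, and the assumption that Yahoo polling needs no key follows only from it being listed as free.

```python
import os


def available_sources(env=None):
    """Return the data sources usable with the credentials currently set."""
    env = os.environ if env is None else env
    sources = ["yahoo"]  # assumed to need no API key
    if env.get("ALPACA_API_KEY") and env.get("ALPACA_SECRET_KEY"):
        sources.append("alpaca")
    if env.get("FRED_API_KEY"):
        sources.append("fred")
    return sources


print(available_sources())
```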
|
| 70 |
+
|
| 71 |
+
## Usage
|
| 72 |
+
|
| 73 |
+
### Full Backtest with Advanced Features
|
| 74 |
+
```bash
|
| 75 |
+
# Standard backtest
|
| 76 |
+
python main.py --mode backtest --start 2020-01-01 --end 2024-01-01
|
| 77 |
+
|
| 78 |
+
# With advanced features + macro + sentiment + online learning
|
| 79 |
+
python main.py --mode backtest --start 2020-01-01 --end 2024-01-01 \
|
| 80 |
+
--advanced-features --include-macro --include-sentiment --online-learning
|
| 81 |
+
```
|
| 82 |
+
|
| 83 |
+
### Real-Time Streaming
|
| 84 |
+
```bash
|
| 85 |
+
# Yahoo Finance (free, 15-min delayed)
|
| 86 |
+
python main.py --mode realtime --source yahoo --tickers SPY QQQ AAPL MSFT
|
| 87 |
+
|
| 88 |
+
# Alpaca (free tier, real-time IEX)
|
| 89 |
+
python main.py --mode realtime --source alpaca \
|
| 90 |
+
--api-key YOUR_KEY --secret-key YOUR_SECRET
|
| 91 |
+
|
| 92 |
+
# Polygon.io (professional)
|
| 93 |
+
python main.py --mode realtime --source polygon --api-key YOUR_KEY
|
| 94 |
+
```
|
| 95 |
+
|
| 96 |
+
### Train Model
|
| 97 |
+
```bash
|
| 98 |
+
python main.py --mode train --tickers SPY QQQ AAPL MSFT GOOGL AMZN META NVDA TSLA JPM \
|
| 99 |
+
--epochs 50 --advanced-features --include-macro
|
| 100 |
+
```
|
| 101 |
+
|
| 102 |
+
### Options Pricing
|
| 103 |
+
```bash
|
| 104 |
+
python main.py --mode options
|
| 105 |
```
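
For readers who want to see how the flags in the commands above fit together, here is a hypothetical `argparse` layout that accepts exactly those invocations. It is reconstructed only from the examples in this section; the real parser in `main.py` may define different defaults, types, or additional options.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the flags shown in the commands above."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--mode", required=True,
                        choices=["train", "backtest", "realtime", "options"])
    parser.add_argument("--start", help="backtest start date, YYYY-MM-DD")
    parser.add_argument("--end", help="backtest end date, YYYY-MM-DD")
    parser.add_argument("--tickers", nargs="+", default=[])
    parser.add_argument("--source", choices=["yahoo", "alpaca", "polygon"])
    parser.add_argument("--api-key")
    parser.add_argument("--secret-key")
    parser.add_argument("--epochs", type=int)
    for flag in ("--advanced-features", "--include-macro",
                 "--include-sentiment", "--online-learning"):
        parser.add_argument(flag, action="store_true")
    return parser


args = build_parser().parse_args(
    ["--mode", "realtime", "--source", "yahoo", "--tickers", "SPY", "QQQ"]
)
print(args.mode, args.source, args.tickers)
```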
|
| 106 |
|
| 107 |
+
## File Structure
|
| 108 |
+
|
| 109 |
+
| File | Description |
|
| 110 |
+
|------|-------------|
|
| 111 |
+
| `main.py` | Entry point - train, backtest, or real-time mode |
|
| 112 |
+
| `market_data.py` | OHLCV data fetching + basic features (RSI, MACD, BB) |
|
| 113 |
+
| `alpha_model.py` | LSTM/Transformer/XGBoost ensemble with IC tracking |
|
| 114 |
+
| `sentiment_model.py` | FinBERT sentiment with batch processing |
|
| 115 |
+
| `volatility_model.py` | GARCH(1,1) + LSTM volatility forecasting |
|
| 116 |
+
| `portfolio_optimizer.py` | Mean-variance, max-Sharpe, robust, Black-Litterman |
|
| 117 |
+
| `options_pricer.py` | ML options pricing + mispricing detection |
|
| 118 |
+
| `backtest_engine.py` | Full backtest with Sharpe, Sortino, max DD, regime detection |
|
| 119 |
+
| `advanced_features_part1.py` | Microstructure + cross-sectional features |
|
| 120 |
+
| `macro_features.py` | FRED macro + yield curve + VIX + credit spreads |
|
| 121 |
+
| `regime_features.py` | Volatility/trend/liquidity regime detection |
|
| 122 |
+
| `technical_indicators.py` | Ichimoku, Supertrend, Keltner, Volume Profile |
|
| 123 |
+
| `stat_arb_features.py` | Cointegration, spread, relative value, half-life |
|
| 124 |
+
| `online_learning.py` | Drift detection (KS, CUSUM) + adaptive retraining |
|
| 125 |
+
| `realtime_data.py` | Alpaca/Polygon/Yahoo streaming + news + order flow |
|
| 126 |
+
|
| 127 |
+
## Data Sources
|
| 128 |
+
|
| 129 |
+
| Source | Type | Cost | Real-Time |
|
| 130 |
+
|--------|------|------|-----------|
|
| 131 |
+
| **Yahoo Finance** | OHLCV + News | Free | 15-min delayed |
|
| 132 |
+
| **Alpaca Markets** | Trades + Bars | Free tier | Real-time (IEX) |
|
| 133 |
+
| **Polygon.io** | NBBO + Trades + Aggs | Paid | Real-time |
|
| 134 |
+
| **FRED** | Macro (rates, VIX) | Free | Daily |
|
| 135 |
+
| **FMP** | News + Financials | Free tier | Daily |
|
| 136 |
+
| **FinBERT** | Sentiment | Free (local) | Batch |
|
| 137 |
+
|
| 138 |
+
## Metrics
|
| 139 |
+
|
| 140 |
+
| Metric | Description |
|
| 141 |
+
|--------|-------------|
|
| 142 |
+
| **IC** | Information Coefficient (rank correlation between predicted and actual returns) |
|
| 143 |
+
| **IC IR** | IC Information Ratio (mean IC / std IC) |
|
| 144 |
+
| **Sharpe** | Risk-adjusted return (excess return / volatility) |
|
| 145 |
+
| **Sortino** | Downside risk-adjusted return (excess return / downside deviation) |
|
| 146 |
+
| **Max DD** | Maximum peak-to-trough decline |
|
| 147 |
+
| **Calmar** | Annualized return / max drawdown |
|
| 148 |
+
| **Alpha/Beta** | Excess return and market sensitivity |
|
| 149 |
+
| **Turnover** | Portfolio rebalance intensity |
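
Several of the metrics in the table reduce to short formulas. The sketch below shows one common way to compute them from a list of daily returns, assuming a 252-day year and a zero default risk-free rate; the function names are illustrative and `backtest_engine.py` may use different conventions.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumption: daily returns, annualized over 252 sessions


def sharpe(returns, rf=0.0):
    """Annualized mean excess return divided by its sample volatility."""
    excess = [r - rf for r in returns]
    vol = stdev(excess)
    return mean(excess) / vol * math.sqrt(TRADING_DAYS) if vol > 0 else float("nan")


def sortino(returns, rf=0.0):
    """Like Sharpe, but only below-target returns count as risk."""
    excess = [r - rf for r in returns]
    downside = math.sqrt(mean([min(e, 0.0) ** 2 for e in excess]))
    return mean(excess) / downside * math.sqrt(TRADING_DAYS) if downside > 0 else float("inf")


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def calmar(returns):
    """Annualized compounded return divided by maximum drawdown."""
    growth = math.prod(1.0 + r for r in returns)
    annual = growth ** (TRADING_DAYS / len(returns)) - 1.0
    dd = max_drawdown(returns)
    return annual / dd if dd > 0 else float("inf")


def _ranks(values):
    """Average ranks (1-based), so ties share the same rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def information_coefficient(predicted, actual):
    """Spearman rank correlation between predictions and outcomes."""
    rp, ra = _ranks(predicted), _ranks(actual)
    mp, ma = mean(rp), mean(ra)
    cov = sum((a - mp) * (b - ma) for a, b in zip(rp, ra))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_a = sum((b - ma) ** 2 for b in ra)
    return cov / math.sqrt(var_p * var_a)


daily = [0.01, -0.005, 0.02, 0.003, -0.01]
print(sharpe(daily), sortino(daily), max_drawdown(daily), calmar(daily))
```

IC IR then follows as the mean of a series of per-period IC values divided by their standard deviation.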
|
| 150 |
+
|
| 151 |
+
## Research Backing
|
| 152 |
+
|
| 153 |
+
- **Alpha Models**: xLSTM-TS with wavelet denoising (Lopez Gil et al., 2024)
|
| 154 |
+
- **Sentiment**: FinBERT (Araci, 2019) with ChatGPT benchmarking
|
| 155 |
+
- **Volatility**: LSTM with skewed Student's t (Michańkow, 2025)
|
| 156 |
+
- **Portfolio**: Multi-task learning joint optimization (Ong & Herremans, 2023)
|
| 157 |
+
- **Options**: 5-layer FNN outperforming Black-Scholes (Berger et al., 2023)
|
| 158 |
+
- **Microstructure**: Amihud (2002), Kyle (1985), Corwin-Schultz (2012), Roll (1984)
|
| 159 |
+
- **Online Learning**: CUSUM change detection, KS drift test
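
Since the options entry above benchmarks a neural pricer against Black-Scholes, it may help to see the benchmark itself. The following is a minimal sketch of the Black-Scholes price for a European call with no dividends, plus a hypothetical mispricing signal defined as model price minus market price; that sign convention is an assumption, not something stated in this README.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, t, rate, vol):
    """Black-Scholes price of a European call (no dividends), t in years."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)


def mispricing(model_price, market_price):
    """Assumed convention: positive means the market quote is below the model."""
    return model_price - market_price


model = bs_call(spot=100.0, strike=100.0, t=1.0, rate=0.05, vol=0.2)
print(round(model, 4), round(mispricing(model, 9.80), 4))
```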
|
| 160 |
+
|
| 161 |
+
## License
|
| 162 |
+
|
| 163 |
+
MIT
|