---
license: mit
tags:
- quant-trading
- alpha-model
- portfolio-optimization
- volatility-forecasting
- sentiment-analysis
- machine-learning
- financial-ai
- k2-think-v2
- multi-market
- cross-asset
language:
- en
---

# AlphaForge v3.1: Multi-Market Institutional-Grade Quantitative Trading System

> **A research-backed, modular, institutional-grade quantitative trading framework supporting 9 global markets.**
> 
> Built for the [Build with K2 Think V2 Challenge](https://build.k2think.ai/) by MBZUAI.

---

## 🚀 Quick Start

```bash
git clone https://huggingface.co/Premchan369/alphaforge-quant-system
cd alphaforge-quant-system
pip install -r requirements.txt
python main.py --mode full --tickers SPY QQQ AAPL
```

---

## 📊 Live Demo

**[AlphaForge x K2 Think V2: Interactive Gradio Space](https://huggingface.co/spaces/Premchan369/alphaforge-k2think)**

Features: real-time multi-market analysis (US, UK, DE, JP, CN, IN, Crypto, Forex, Commodities), AI deep analysis, cross-market portfolio optimization, and direct AI chat.

---

## 🌍 Multi-Market Coverage

| Market | Suffix | Examples | Currency | Session |
|--------|--------|----------|----------|---------|
| 🇺🇸 **US Equities** | (none) | AAPL, TSLA, SPY, NVDA | USD | 09:30-16:00 ET |
| 🇬🇧 **UK Equities** | .L | SHEL.L, ULVR.L, AZN.L | GBP | 08:00-16:30 GMT |
| 🇩🇪 **Germany Equities** | .DE | SAP.DE, SIE.DE, ALV.DE | EUR | 09:00-17:30 CET |
| 🇯🇵 **Japan Equities** | .T | 7203.T, 9984.T, 6758.T | JPY | 09:00-15:00 JST |
| 🇨🇳 **China Equities** | .SS/.SZ | 600519.SS, 000858.SZ | CNY | 09:30-15:00 CST |
| 🇮🇳 **India Equities** | .NS | RELIANCE.NS, TCS.NS, INFY.NS | INR | 09:15-15:30 IST |
| ₿ **Crypto** | -USD | BTC-USD, ETH-USD, SOL-USD | USD | 24/7 |
| 💱 **Forex** | =X | EURUSD=X, GBPUSD=X, USDJPY=X | USD | 24/5 |
| 🛢 **Commodities** | =F | GC=F, CL=F, SI=F | USD | 08:20-13:30 ET |

### Cross-Market Portfolio Optimization
The system supports **mixed-asset portfolios** across all markets simultaneously:

```
Example: AAPL (US) + BTC-USD (Crypto) + EURUSD=X (Forex) + GC=F (Commodities) + SHEL.L (UK)
```

Auto-detection of market from symbol suffixes enables seamless multi-asset analysis.
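
As a rough illustration of suffix-based detection, the sketch below maps a symbol to a market code using only the suffixes listed in the coverage table. The names `SUFFIX_RULES` and `detect_market` are hypothetical and are not taken from the AlphaForge codebase; the actual implementation may differ.

```python
# Hypothetical helper (not the project's API): suffix rules copied from the
# coverage table; any symbol without a known suffix is treated as a US equity.
SUFFIX_RULES = [
    ("-USD", "Crypto"),
    ("=X", "Forex"),
    ("=F", "Commodities"),
    (".L", "UK"),
    (".DE", "DE"),
    (".T", "JP"),
    (".SS", "CN"),
    (".SZ", "CN"),
    (".NS", "IN"),
]


def detect_market(symbol: str) -> str:
    """Return a market code for a ticker based on its suffix."""
    upper = symbol.strip().upper()
    for suffix, market in SUFFIX_RULES:
        if upper.endswith(suffix):
            return market
    return "US"


portfolio = ["AAPL", "BTC-USD", "EURUSD=X", "GC=F", "SHEL.L"]
markets = {symbol: detect_market(symbol) for symbol in portfolio}
print(markets)
# {'AAPL': 'US', 'BTC-USD': 'Crypto', 'EURUSD=X': 'Forex', 'GC=F': 'Commodities', 'SHEL.L': 'UK'}
```

A plain ordered list of rules keeps the mapping easy to audit against the table; defaulting to US mirrors the "(none)" suffix row.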

---

## 🧠 What This Project Is

**AlphaForge** is an institutional-grade quantitative trading system built as a modular open-source Python framework. It was created to:

- Predict multi-asset expected returns (μ) across **9 global markets**
- Analyze financial sentiment via FinBERT and LLM embeddings
- Forecast volatility (σ) and covariance matrices (Σ)
- Optimize **cross-market portfolios** with real-world constraints
- Price options with a feed-forward network benchmarked against Black-Scholes
- Run **honest** backtests with walk-forward validation
- Control drawdowns with CPPI and Kelly criterion
- Guard against data snooping bias
- Detect market regimes and adapt strategies
- Measure liquidity risk and position capacity
- Model transaction costs with market impact
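
To make the drawdown-control item concrete, here is a minimal sketch of fractional Kelly sizing and a CPPI allocation loop. It assumes a constant floor, a safe asset that earns zero, and no leverage; `kelly_fraction` and `cppi_path` are illustrative names, not functions from `drawdown_control.py`.

```python
def kelly_fraction(mean_excess_return: float, variance: float, fraction: float = 0.5) -> float:
    """Continuous-time Kelly weight (mean / variance) scaled by a safety fraction."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    return fraction * mean_excess_return / variance


def cppi_path(risky_returns, floor_ratio=0.8, multiplier=3.0, start_value=100.0):
    """Simulate portfolio value under CPPI with a fixed floor (assumption: safe asset earns 0)."""
    value = start_value
    floor = floor_ratio * start_value
    path = [value]
    for r in risky_returns:
        cushion = max(value - floor, 0.0)           # wealth above the protected floor
        risky = min(multiplier * cushion, value)    # exposure, capped at 100% (no leverage)
        value = risky * (1.0 + r) + (value - risky)
        path.append(value)
    return path


# Five consecutive -10% periods: exposure shrinks as the cushion erodes.
path = cppi_path([-0.10] * 5)
print([round(v, 2) for v in path])
print(kelly_fraction(mean_excess_return=0.08, variance=0.04, fraction=0.5))
```

With a multiplier of 3, the floor holds as long as no single-period loss exceeds roughly 1/3; larger gaps can breach it, which is why the multiplier is usually kept modest.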

---

## 🏗 Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                     MULTI-MARKET DATA LAYER                     │
│  US │ UK │ DE │ JP │ CN │ IN │ Crypto │ Forex │ Commodities    │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                  MARKET-SPECIFIC NORMALIZATION                  │
│  Suffix handling │ Currency │ Session timing │ Local holidays   │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                    UNIFIED ANALYSIS PIPELINE                    │
│  Technical Indicators │ Regime Detection │ Risk Metrics         │
│  Position Sizing │ Liquidity Analysis │ Event Calendar          │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                     CROSS-MARKET PORTFOLIO                      │
│  Auto-detect market │ Mixed-asset optimization │ Tx cost model  │
└─────────────────────────────────────────────────────────────────┘
```

---

## 📁 Module Overview (33+ Modules)

| Module | Purpose | Research Basis |
|--------|---------|--------------|
| `market_data.py` | Multi-market OHLCV fetching with suffix normalization | Standard TA |
| `sentiment_model.py` | FinBERT / LLM embeddings for financial sentiment | Yang et al. 2020 (FinBERT) |
| `alpha_model.py` | XGBoost + LSTM expected return prediction | Gu et al. 2020 |
| `volatility_model.py` | GARCH baseline + LSTM volatility forecasting | Michankow 2025 |
| `portfolio_optimizer.py` | Mean-variance with constraints, Black-Litterman | Markowitz 1952 |
| `options_model.py` | ML option pricing (5-layer FNN beats BS) | Berger et al. 2023 |
| `backtest_engine.py` | Honest backtesting with transaction costs | Lopez de Prado 2018 |
| `walk_forward_validation.py` | Expanding/sliding/purged/CPCV splits | Lopez de Prado 2018/2019 |
| `wavelet_denoising.py` | Wavelet noise reduction for time series | Lopez Gil 2024 |
| `alpha_mining.py` | Genetic programming + LLM-driven factor discovery | gplearn |
| `multi_task_learning.py` | Joint optimization: alpha + vol + portfolio | Ong & Herremans 2023 |
| `execution_algorithms.py` | TWAP, VWAP, Smart Order Router, Almgren-Chriss | Almgren & Chriss 2001 |
| `risk_management.py` | VaR/CVaR (hist/parametric/MC), stress tests | Jorion 2006 |
| `market_microstructure.py` | Kyle's lambda, VPIN, Roll measure, OFI, Amihud | Kyle 1985 |
| `hyperparameter_sweep.py` | Grid, random, Latin Hypercube sampling | Bergstra & Bengio 2012 |
| `gpu_optimization.py` | Flash Attention, AMP, gradient checkpointing | PyTorch best practices |
| `rl_execution.py` | PPO-based Deep Hedging optimal execution | Buehler et al. 2019 |
| `limit_order_book.py` | Level 2 LOB reconstruction, synthetic message feeds | Gould et al. 2013 |
| `market_making.py` | Avellaneda-Stoikov quoting, adverse selection | Avellaneda & Stoikov 2008 |
| `synthetic_market_sim.py` | Agent-based modeling, regime switching | LeBaron 2006 |
| `online_learning.py` | Per-symbol adaptive models, concept drift | Gama et al. 2014 |
| `stat_arb.py` | Cointegration, PCA mean-reversion, lead-lag | Gatev et al. 2006 |
| `conformal_prediction.py` | Distribution-free prediction intervals | Shafer & Vovk 2008 |
| `feature_store.py` | Microsecond feature computation, per-feature drift | Feature Store best practices |
| `adversarial_defense.py` | FGSM attacks, model watermarking, evasion monitoring | Goodfellow et al. 2015 |
| `ab_testing.py` | Sequential testing, multiple comparison correction | Johari et al. 2022 |
| `correlation_regime.py` | DCC-GARCH dynamic correlations, Ledoit-Wolf shrinkage | Engle 2002 |
| `news_data_integration.py` | NewsAPI, RSS, GDELT, Reddit/StockTwits aggregation | Alternative data |
| `regime_detection.py` | HMM/GMM market regime classifier, regime-conditioned Sharpe | Hamilton 1989 |
| `transaction_cost_model.py` | Square-root market impact, spread, fees, optimal participation | Almgren et al. 2005 |
| `drawdown_control.py` | CPPI insurance, fractional Kelly, dynamic leverage | Perold & Sharpe 1988 |
| `liquidity_risk.py` | Amihud illiquidity, Kyle's lambda, VPIN, position capacity | Amihud 2002 |
| `data_snooping_guard.py` | White's Reality Check, FDR, Bonferroni/Holm | White 2000 |
| `event_study.py` | Post-earnings drift, macro events, merger arbitrage | MacKinlay 1997 |
| `cross_sectional_factors.py` | Fama-French 5-factor, momentum, quality, low-vol | Fama & French 2015 |
| `factor_risk_model.py` | Barra-style multi-factor risk decomposition | Grinold & Kahn 2000 |
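
As one example of the multiple-comparison corrections listed for `data_snooping_guard.py`, the sketch below implements the Holm step-down procedure in plain Python. The function name `holm_bonferroni` and the sample p-values are illustrative assumptions, not the module's actual interface or results.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where Holm's procedure rejects the null."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k), k = 0..m-1.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: stop at the first failure
    return reject


# Made-up p-values for four candidate strategies.
p_values = [0.01, 0.015, 0.02, 0.03]
holm = holm_bonferroni(p_values)
bonferroni = [p <= 0.05 / len(p_values) for p in p_values]
print(holm)        # [True, True, True, True]
print(bonferroni)  # [True, False, False, False]
```

Holm controls the same family-wise error rate as plain Bonferroni but is never less powerful, which the example makes visible.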

---

## 📈 Key Metrics & Scoring

| Metric | Description | Target |
|--------|-------------|--------|
| **Sharpe Ratio** | Risk-adjusted return | > 1.0 |
| **Sortino Ratio** | Downside risk-adjusted return | > 1.5 |
| **Information Coefficient (IC)** | Predicted vs actual return correlation | > 0.05 |
| **Max Drawdown** | Worst peak-to-trough decline | Shallower than -20% |
| **VaR (95%)** | Value at Risk | Reported |
| **CVaR (95%)** | Conditional VaR / Expected Shortfall | Reported |
| **Calmar Ratio** | Annualized return / absolute max drawdown | > 1.0 |
| **Win Rate** | % of positive return days | Reported |
| **Profit Factor** | Gross profit / Gross loss | > 1.2 |
| **GOAT Score** | Composite 0-100 scoring system | > 70 |
| **Regime-Conditioned Sharpe** | Sharpe in current market regime | Contextual |
| **Transaction Cost Drag** | Annualized cost of trading | < 2% |
| **Liquidity Score** | Amihud illiquidity ranking | Reported |
| **Kelly Fraction** | Optimal leverage for growth | < 1.0 (practical) |
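
For readers who want to see how a few of these metrics are typically computed, here is a minimal sketch from a list of per-period returns. It assumes a zero risk-free rate and 252 trading periods per year; the function names are illustrative and may not match the project's own scoring code.

```python
import math


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized mean / sample standard deviation (risk-free rate assumed zero)."""
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Worst peak-to-trough decline of the compounded equity curve (zero or negative)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def profit_factor(returns):
    """Gross profit divided by gross loss; infinite when there are no losing periods."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return gains / losses if losses > 0 else math.inf


def win_rate(returns):
    """Share of periods with a strictly positive return."""
    return sum(1 for r in returns if r > 0) / len(returns)


sample = [0.10, -0.50, 0.20]  # made-up returns, for illustration only
print(max_drawdown(sample), profit_factor(sample), win_rate(sample))
```

Returning drawdown as a negative number matches the sign convention of the target column, where a shallower drawdown is closer to zero.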

---

## 📚 Research Foundation

Every major component is backed by published research:

| Component | Citation | Key Finding |
|-----------|----------|-------------|
| Wavelet Denoising | Lopez Gil 2024 (xLSTM-TS) | `db4` + soft thresholding |
| Multi-Task Learning | Ong & Herremans 2023 (MTL-TSMOM) | Joint MTL with negative Sharpe loss |
| Walk-Forward Validation | Lopez de Prado 2018/2019 | Purged CV + combinatorial CPCV |
| Options Pricing | Berger et al. 2023 | 5-layer FNN beats Black-Scholes |
| Volatility | Michankow 2025 | Skewed Student's t LSTM |
| RL Execution | Buehler et al. 2019 | Deep Hedging (PPO) |
| Market Making | Avellaneda & Stoikov 2008 | Inventory management |
| Correlation Regimes | Engle 2002 | DCC-GARCH dynamic correlations |
| Regime Detection | Hamilton 1989 | HMM for nonstationary time series |
| Transaction Costs | Almgren et al. 2005 | Square-root market impact law |
| Drawdown Control | Perold & Sharpe 1988 | CPPI dynamic asset allocation |
| Kelly Criterion | Thorp 2006 | Fractional Kelly for practical trading |
| Liquidity Risk | Amihud 2002 | Illiquidity premium via price impact ratio |
| Data Snooping | White 2000 | Bootstrap reality check for multiple testing |
| Event Studies | MacKinlay 1997 | Abnormal return methodology |
| Fama-French Factors | Fama & French 2015 | 5-factor asset pricing model |
| Factor Risk | Grinold & Kahn 2000 | Multi-factor risk decomposition |
| Cross-Market Arbitrage | Gatev et al. 2006 | Pairs trading with cointegration |
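
The square-root market impact law cited for transaction costs can be sketched as follows. The cost components and default coefficients (`half_spread_bps`, `fee_bps`, `impact_coeff`) are placeholder assumptions for illustration, not calibrated values or the interface of `transaction_cost_model.py`.

```python
import math


def estimate_cost_bps(order_shares, adv_shares, daily_vol,
                      half_spread_bps=1.0, fee_bps=0.5, impact_coeff=1.0):
    """Estimate one-way trading cost in basis points.

    impact = impact_coeff * daily_vol * sqrt(order_shares / adv_shares),
    with daily_vol given as a decimal (0.02 means 2% per day).
    """
    if adv_shares <= 0:
        raise ValueError("adv_shares must be positive")
    participation = order_shares / adv_shares
    impact_bps = impact_coeff * daily_vol * 1e4 * math.sqrt(participation)
    return half_spread_bps + fee_bps + impact_bps


small = estimate_cost_bps(order_shares=10_000, adv_shares=1_000_000, daily_vol=0.02)
large = estimate_cost_bps(order_shares=40_000, adv_shares=1_000_000, daily_vol=0.02)
print(small, large)
```

Because impact grows with the square root of participation, quadrupling the order size roughly doubles the impact component rather than quadrupling it.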

---

## 🛠 Installation

### Core Dependencies
```bash
pip install -r requirements.txt
```

### Optional Dependencies (for advanced modules)
```bash
pip install gplearn PyWavelets feedparser praw arch requests
```

### GPU Support
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```

---

## 📖 Usage

### Single Market Analysis
```bash
# US Equity
python main.py --mode full --tickers AAPL --market US

# UK Equity
python main.py --mode full --tickers SHEL.L --market UK

# Crypto
python main.py --mode full --tickers BTC-USD --market Crypto

# Forex
python main.py --mode full --tickers EURUSD=X --market Forex
```

### Cross-Market Portfolio Optimization
```bash
# Mixed-asset portfolio across 5 markets (US, Crypto, Forex, Commodities, UK)
python main.py --mode portfolio --tickers AAPL,BTC-USD,EURUSD=X,GC=F,SHEL.L
```

### Walk-Forward Backtest
```bash
python main.py --mode walkforward --tickers AAPL TSLA NVDA --market US
```
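
Conceptually, an expanding-window walk-forward split with a purge gap looks like the sketch below. It is a simplified illustration, not the splitter in `walk_forward_validation.py`: the function name, parameters, and the fixed-gap form of purging are assumptions, and it does not cover CPCV.

```python
def walk_forward_splits(n_samples, n_folds=4, min_train=100, purge=5):
    """Yield (train_range, test_range) pairs for expanding-window validation.

    The last `purge` observations before each test block are dropped from
    training so that labels overlapping the test period cannot leak.
    """
    if min_train <= purge:
        raise ValueError("min_train must exceed purge")
    test_size = (n_samples - min_train) // n_folds
    if test_size <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        test_start = min_train + fold * test_size
        test_end = test_start + test_size
        yield range(0, test_start - purge), range(test_start, test_end)


splits = list(walk_forward_splits(n_samples=500, n_folds=4, min_train=100, purge=5))
for train, test in splits:
    print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
```

Each fold trains only on data that precedes its test block, so every out-of-sample prediction uses information that would have been available at the time.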

---

## 🤝 Contributing

This is an open-source project. Contributions welcome:

1. Fork the repository
2. Create a feature branch
3. Submit a PR with tests
4. Follow the research-first philosophy

---

## 📝 License

MIT License. See `LICENSE`.

---

## 🙏 Acknowledgments

- Built for the **Build with K2 Think V2 Challenge** by [MBZUAI](https://mbzuai.ac.ae/)
- K2 Think V2 model by [MBZUAI-IFM](https://huggingface.co/MBZUAI-IFM)
- Research inspiration from Marcos Lopez de Prado, Avellaneda & Stoikov, and the quantitative finance community

---

*Built by Premchan | AlphaForge v3.1 | 33+ Quant Modules | 9 Global Markets | Institutional-Grade Trading*