GitHub Actions committed on
Commit ·
7e96b08
1
Parent(s): 6d4734e
Sync from GitHub: 6ed9981d3cbd810f18c7f5d897bdfcb3420a9091
- README.md +110 -14
- app.py +273 -0
- hf_space/.gitattributes +35 -0
- hf_space/Dockerfile +20 -0
- hf_space/README.md +19 -0
- hf_space/requirements.txt +3 -0
- hf_space/src/streamlit_app.py +40 -0
- requirements.txt +29 -3
README.md
CHANGED
|
@@ -1,19 +1,115 @@
| 1 |
---
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
---
|
| 13 |
|
| 14 |
-
#
|
| 15 |
|
| 16 |
-
|
| 17 |
|
| 18 |
-
|
| 19 |
-
|
| 1 |
+
# P2-ETF-CNN-LSTM-ALTERNATIVE-APPROACHES
|
| 2 |
+
|
| 3 |
+
Macro-driven ETF rotation using three augmented CNN-LSTM variants.
|
| 4 |
+
Winner selected by **highest raw annualised return** on the out-of-sample test set.
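
The selection rule stated above can be illustrated with a minimal sketch. It assumes each approach produces a list of simple daily strategy returns and that a year has 252 trading days. `annualised_return` and `pick_winner` are hypothetical names for illustration; they are not the functions in `strategy/backtest.py`, whose implementation is not part of this diff.

```python
import math

TRADING_DAYS = 252  # assumption: conventional number of trading days per year


def annualised_return(daily_returns):
    """Geometric annualised return from a list of simple daily returns."""
    if not daily_returns:
        return float("nan")
    growth = math.prod(1.0 + r for r in daily_returns)
    return growth ** (TRADING_DAYS / len(daily_returns)) - 1.0


def pick_winner(results):
    """Return the name with the highest annualised return, skipping failed (None) runs."""
    scored = {name: annualised_return(r) for name, r in results.items() if r is not None}
    return max(scored, key=scored.get) if scored else None
```

Ranking on raw return alone ignores risk, which is presumably why the UI also reports Sharpe and maximum drawdown next to the winner.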
|
| 5 |
+
|
| 6 |
+
---
|
| 7 |
+
|
| 8 |
+
## Architecture Overview
|
| 9 |
+
|
| 10 |
+
| Approach | Core Idea | Key Addition |
|
| 11 |
+
|---|---|---|
|
| 12 |
+
| **1 — Wavelet** | DWT decomposes each macro signal into frequency subbands before the CNN | Separates trend / cycle / noise |
|
| 13 |
+
| **2 — Regime-Conditioned** | HMM detects macro regimes; one-hot regime label concatenated into the network | Removes non-stationarity |
|
| 14 |
+
| **3 — Multi-Scale Parallel** | Three CNN towers (kernels 3, 7, 21 days) run in parallel before the LSTM | Captures momentum + cycle + trend simultaneously |
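
To make the Approach 1 row concrete, here is a minimal sketch of a one-level discrete wavelet transform in plain Python. It assumes the Haar wavelet and an even-length signal. The repository lists PyWavelets in `requirements.txt`, but the wavelet family and decomposition depth actually used in `models/approach1_wavelet.py` are not shown in this diff, so `haar_dwt` and `haar_idwt` are illustrative stand-ins only.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficient lists.

    Assumes an even-length input; samples are paired as (x[0], x[1]), (x[2], x[3]), ...
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(a + b) / s for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) / s for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out
```

The approximation coefficients carry the slow-moving part of the series and the detail coefficients carry the fast-moving part. Applying the transform again to the approximation yields further subbands.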
|
| 15 |
+
|
| 16 |
---
|
| 17 |
+
|
| 18 |
+
## ETF Universe
|
| 19 |
+
|
| 20 |
+
| Ticker | Description |
|
| 21 |
+
|---|---|
|
| 22 |
+
| TLT | 20+ Year Treasury Bond |
|
| 23 |
+
| TBT | 20+ Year Treasury Short (2×) |
|
| 24 |
+
| VNQ | Real Estate (REIT) |
|
| 25 |
+
| SLV | Silver |
|
| 26 |
+
| GLD | Gold |
|
| 27 |
+
| CASH | 3m T-bill rate (from HF dataset) |
|
| 28 |
+
|
| 29 |
+
Benchmarks (chart only, not traded): **SPY**, **AGG**
|
| 30 |
+
|
| 31 |
+
---
|
| 32 |
+
|
| 33 |
+
## Data
|
| 34 |
+
|
| 35 |
+
All data sourced exclusively from:
|
| 36 |
+
**`P2SAMAPA/fi-etf-macro-signal-master-data`** (HuggingFace Dataset)
|
| 37 |
+
File: `master_data.parquet`
|
| 38 |
+
|
| 39 |
+
No external API calls (no yfinance, no FRED).
|
| 40 |
+
The app checks daily whether the prior NYSE trading day's data is present in the dataset.
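
The freshness rule above can be sketched as follows. This simplified version treats every Monday to Friday as a trading day and ignores exchange holidays. The app itself depends on `pandas_market_calendars` for the NYSE calendar, and `previous_weekday` and `is_fresh` are hypothetical helpers, not the code in `data/loader.py`.

```python
from datetime import date, timedelta


def previous_weekday(today):
    """Most recent Monday-Friday strictly before `today` (exchange holidays ignored)."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def is_fresh(last_data_date, today):
    """True if the dataset already contains the prior weekday's row."""
    return last_data_date >= previous_weekday(today)
```

For example, on a Monday the check looks for the preceding Friday's row rather than Sunday's.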
|
| 41 |
+
|
| 42 |
---
|
| 43 |
|
| 44 |
+
## Project Structure
|
| 45 |
+
|
| 46 |
+
```
|
| 47 |
+
├── .github/
|
| 48 |
+
│ └── workflows/
|
| 49 |
+
│ └── sync.yml # Auto-sync GitHub → HF Space on push to main
|
| 50 |
+
│
|
| 51 |
+
├── app.py # Streamlit orchestrator (UI wiring only)
|
| 52 |
+
│
|
| 53 |
+
├── data/
|
| 54 |
+
│ └── loader.py # HF dataset load, freshness check, column validation
|
| 55 |
+
│
|
| 56 |
+
├── models/
|
| 57 |
+
│ ├── base.py # Shared: sequences, splits, scaling, callbacks
|
| 58 |
+
│ ├── approach1_wavelet.py # Wavelet CNN-LSTM
|
| 59 |
+
│ ├── approach2_regime.py # Regime-Conditioned CNN-LSTM
|
| 60 |
+
│ └── approach3_multiscale.py # Multi-Scale Parallel CNN-LSTM
|
| 61 |
+
│
|
| 62 |
+
├── strategy/
|
| 63 |
+
│ └── backtest.py # execute_strategy, metrics, winner selection
|
| 64 |
+
│
|
| 65 |
+
├── signals/
|
| 66 |
+
│ └── conviction.py # Z-score conviction scoring
|
| 67 |
+
│
|
| 68 |
+
├── ui/
|
| 69 |
+
│ ├── components.py # Banner, conviction panel, metrics, audit trail
|
| 70 |
+
│ └── charts.py # Plotly equity curve + comparison bar chart
|
| 71 |
+
│
|
| 72 |
+
├── utils/
|
| 73 |
+
│ └── calendar.py # NYSE calendar, next trading day, EST time
|
| 74 |
+
│
|
| 75 |
+
├── requirements.txt
|
| 76 |
+
└── README.md
|
| 77 |
+
```
|
| 78 |
+
|
| 79 |
+
---
|
| 80 |
+
|
| 81 |
+
## Secrets Required
|
| 82 |
+
|
| 83 |
+
| Secret | Where | Purpose |
|
| 84 |
+
|---|---|---|
|
| 85 |
+
| `HF_TOKEN` | GitHub + HF Space | Read HF dataset · Sync HF Space |
|
| 86 |
+
|
| 87 |
+
Set in:
|
| 88 |
+
- GitHub: `Settings → Secrets → Actions → New repository secret`
|
| 89 |
+
- HF Space: `Settings → Repository secrets`
|
| 90 |
+
|
| 91 |
+
---
|
| 92 |
+
|
| 93 |
+
## Deployment
|
| 94 |
+
|
| 95 |
+
Push to `main` → GitHub Actions (`sync.yml`) automatically syncs to HF Space.
|
| 96 |
+
|
| 97 |
+
### Local development
|
| 98 |
+
|
| 99 |
+
```bash
|
| 100 |
+
pip install -r requirements.txt
|
| 101 |
+
export HF_TOKEN=your_token
|
| 102 |
+
streamlit run app.py
|
| 103 |
+
```
|
| 104 |
+
|
| 105 |
+
---
|
| 106 |
|
| 107 |
+
## Output UI
|
| 108 |
|
| 109 |
+
1. **Data freshness warning** — alerts if prior NYSE trading day data is missing
|
| 110 |
+
2. **Next Trading Day Signal** — date + ETF from the winning approach
|
| 111 |
+
3. **Signal Conviction** — Z-score gauge + per-ETF probability bars
|
| 112 |
+
4. **Performance Metrics** — Annualised Return, Sharpe, Hit Ratio, Max DD
|
| 113 |
+
5. **Approach Comparison Table** — all three approaches side by side
|
| 114 |
+
6. **Equity Curves** — all three approaches + SPY + AGG benchmarks
|
| 115 |
+
7. **Audit Trail** — last 20 trading days for the winning approach
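
One plausible reading of the Z-score conviction in item 3 is sketched below. It assumes conviction is the distance of the top class probability from the mean of all class probabilities, measured in population standard deviations. The real logic lives in `signals/conviction.py`, which is not part of this diff, so `conviction_z` is a hypothetical name and the formula is an assumption.

```python
import statistics


def conviction_z(probabilities, labels):
    """Return (top_label, z), where z is how far the top probability sits
    above the mean of all class probabilities, in standard deviations."""
    if len(probabilities) != len(labels) or len(probabilities) < 2:
        raise ValueError("need matching labels and at least two classes")
    mean = statistics.fmean(probabilities)
    std = statistics.pstdev(probabilities)
    top = max(range(len(probabilities)), key=probabilities.__getitem__)
    # A flat distribution has zero spread, so report zero conviction.
    z = 0.0 if std == 0 else (probabilities[top] - mean) / std
    return labels[top], z
```

A model that spreads probability evenly across ETFs scores near zero, while one that concentrates it on a single ticker scores high.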
|
app.py
ADDED
|
@@ -0,0 +1,273 @@
| 1 |
+
"""
|
| 2 |
+
app.py
|
| 3 |
+
P2-ETF-CNN-LSTM-ALTERNATIVE-APPROACHES
|
| 4 |
+
Streamlit orchestrator — UI wiring only, no business logic here.
|
| 5 |
+
"""
|
| 6 |
+
|
| 7 |
+
import os
|
| 8 |
+
import streamlit as st
|
| 9 |
+
import pandas as pd
|
| 10 |
+
import numpy as np
|
| 11 |
+
|
| 12 |
+
# ── Module imports ────────────────────────────────────────────────────────────
|
| 13 |
+
from data.loader import load_dataset, check_data_freshness, get_features_and_targets, dataset_summary
|
| 14 |
+
from utils.calendar import get_est_time, is_sync_window, get_next_signal_date
|
| 15 |
+
from models.base import build_sequences, train_val_test_split, scale_features, returns_to_labels
|
| 16 |
+
from models.approach1_wavelet import train_approach1, predict_approach1
|
| 17 |
+
from models.approach2_regime import train_approach2, predict_approach2
|
| 18 |
+
from models.approach3_multiscale import train_approach3, predict_approach3
|
| 19 |
+
from strategy.backtest import execute_strategy, select_winner, build_comparison_table
|
| 20 |
+
from signals.conviction import compute_conviction
|
| 21 |
+
from ui.components import (
|
| 22 |
+
show_freshness_status, show_signal_banner, show_conviction_panel,
|
| 23 |
+
show_metrics_row, show_comparison_table, show_audit_trail,
|
| 24 |
+
)
|
| 25 |
+
from ui.charts import equity_curve_chart, comparison_bar_chart
|
| 26 |
+
|
| 27 |
+
# ── Page config ───────────────────────────────────────────────────────────────
|
| 28 |
+
st.set_page_config(
|
| 29 |
+
page_title="P2-ETF-CNN-LSTM",
|
| 30 |
+
page_icon="🧠",
|
| 31 |
+
layout="wide",
|
| 32 |
+
)
|
| 33 |
+
|
| 34 |
+
# ── Secrets ───────────────────────────────────────────────────────────────────
|
| 35 |
+
HF_TOKEN = os.getenv("HF_TOKEN", "")
|
| 36 |
+
|
| 37 |
+
# ── Sidebar ───────────────────────────────────────────────────────────────────
|
| 38 |
+
with st.sidebar:
|
| 39 |
+
st.header("⚙️ Configuration")
|
| 40 |
+
|
| 41 |
+
now_est = get_est_time()
|
| 42 |
+
st.write(f"🕒 **EST:** {now_est.strftime('%H:%M:%S')}")
|
| 43 |
+
if is_sync_window():
|
| 44 |
+
st.success("✅ Sync Window Active")
|
| 45 |
+
else:
|
| 46 |
+
st.info("⏸️ Sync Window Inactive")
|
| 47 |
+
|
| 48 |
+
st.divider()
|
| 49 |
+
|
| 50 |
+
start_yr = st.slider("📅 Start Year", 2010, 2024, 2016)
|
| 51 |
+
fee_bps = st.slider("💰 Fee (bps)", 0, 50, 10)
|
| 52 |
+
lookback = st.slider("📐 Lookback (days)", 20, 60, 30, step=5)
|
| 53 |
+
epochs = st.number_input("🔁 Max Epochs", 20, 300, 100, step=10)
|
| 54 |
+
|
| 55 |
+
st.divider()
|
| 56 |
+
|
| 57 |
+
split_option = st.selectbox("📊 Train/Val/Test Split", ["70/15/15", "80/10/10"], index=0)
|
| 58 |
+
split_map = {"70/15/15": (0.70, 0.15), "80/10/10": (0.80, 0.10)}
|
| 59 |
+
train_pct, val_pct = split_map[split_option]
|
| 60 |
+
|
| 61 |
+
include_cash = st.checkbox("💵 Include CASH class", value=True,
|
| 62 |
+
help="Model can select CASH (earns T-bill rate) as an alternative to any ETF")
|
| 63 |
+
|
| 64 |
+
st.divider()
|
| 65 |
+
|
| 66 |
+
run_button = st.button("🚀 Run All 3 Approaches", type="primary", use_container_width=True)
|
| 67 |
+
|
| 68 |
+
# ── Title ─────────────────────────────────────────────────────────────────────
|
| 69 |
+
st.title("🧠 P2-ETF-CNN-LSTM")
|
| 70 |
+
st.caption("Approach 1: Wavelet · Approach 2: Regime-Conditioned · Approach 3: Multi-Scale Parallel")
|
| 71 |
+
st.caption("Winner selected by highest raw annualised return on out-of-sample test set.")
|
| 72 |
+
|
| 73 |
+
# ── Load data (always, to check freshness) ────────────────────────────────────
|
| 74 |
+
if not HF_TOKEN:
|
| 75 |
+
st.error("❌ HF_TOKEN secret not found. Please add it to your HF Space / GitHub secrets.")
|
| 76 |
+
st.stop()
|
| 77 |
+
|
| 78 |
+
with st.spinner("📡 Loading dataset from HuggingFace..."):
|
| 79 |
+
df = load_dataset(HF_TOKEN)
|
| 80 |
+
|
| 81 |
+
if df.empty:
|
| 82 |
+
st.stop()
|
| 83 |
+
|
| 84 |
+
# ── Freshness check ───────────────────────────────────────────────────────────
|
| 85 |
+
freshness = check_data_freshness(df)
|
| 86 |
+
show_freshness_status(freshness)
|
| 87 |
+
|
| 88 |
+
# ── Dataset summary in sidebar ────────────────────────────────────────────────
|
| 89 |
+
with st.sidebar:
|
| 90 |
+
st.divider()
|
| 91 |
+
st.subheader("📦 Dataset Info")
|
| 92 |
+
summary = dataset_summary(df)
|
| 93 |
+
if summary:
|
| 94 |
+
st.write(f"**Rows:** {summary['rows']:,}")
|
| 95 |
+
st.write(f"**Range:** {summary['start_date']} → {summary['end_date']}")
|
| 96 |
+
st.write(f"**ETFs:** {', '.join([e.replace('_Ret','') for e in summary['etfs_found']])}")
|
| 97 |
+
st.write(f"**Benchmarks:** {', '.join([b.replace('_Ret','') for b in summary['benchmarks']])}")
|
| 98 |
+
st.write(f"**T-bill col:** {'✅' if summary['tbill_found'] else '❌'}")
|
| 99 |
+
|
| 100 |
+
# ── Main execution ────────────────────────────────────────────────────────────
|
| 101 |
+
if not run_button:
|
| 102 |
+
st.info("👈 Configure parameters in the sidebar and click **🚀 Run All 3 Approaches** to begin.")
|
| 103 |
+
st.stop()
|
| 104 |
+
|
| 105 |
+
# ── Filter by start year ──────────────────────────────────────────────────────
|
| 106 |
+
df = df[df.index.year >= start_yr].copy()
|
| 107 |
+
st.write(f"📅 **Data:** {df.index[0].strftime('%Y-%m-%d')} → {df.index[-1].strftime('%Y-%m-%d')} "
|
| 108 |
+
f"({df.index[-1].year - df.index[0].year + 1} years)")
|
| 109 |
+
|
| 110 |
+
# ── Feature / target extraction ───────────────────────────────────────────────
|
| 111 |
+
try:
|
| 112 |
+
input_features, target_etfs, tbill_rate = get_features_and_targets(df)
|
| 113 |
+
except ValueError as e:
|
| 114 |
+
st.error(str(e))
|
| 115 |
+
st.stop()
|
| 116 |
+
|
| 117 |
+
st.info(f"🎯 **Targets:** {len(target_etfs)} ETFs · **Features:** {len(input_features)} signals · "
|
| 118 |
+
f"**T-bill rate:** {tbill_rate*100:.2f}%")
|
| 119 |
+
|
| 120 |
+
# ── Prepare sequences ─────────────────────────────────────────────────────────
|
| 121 |
+
X_raw = df[input_features].values.astype(np.float32)
|
| 122 |
+
y_raw = df[target_etfs].values.astype(np.float32)
|
| 123 |
+
n_etfs = len(target_etfs)
|
| 124 |
+
n_classes = n_etfs + (1 if include_cash else 0) # +1 for CASH
|
| 125 |
+
|
| 126 |
+
# Fill NaNs with column means
|
| 127 |
+
col_means = np.nanmean(X_raw, axis=0)
|
| 128 |
+
for j in range(X_raw.shape[1]):
|
| 129 |
+
mask = np.isnan(X_raw[:, j])
|
| 130 |
+
X_raw[mask, j] = col_means[j]
|
| 131 |
+
|
| 132 |
+
X_seq, y_seq = build_sequences(X_raw, y_raw, lookback)
|
| 133 |
+
y_labels = returns_to_labels(y_seq, include_cash=include_cash)
|
| 134 |
+
|
| 135 |
+
X_train, y_train_r, X_val, y_val_r, X_test, y_test_r = train_val_test_split(X_seq, y_seq, train_pct, val_pct)
|
| 136 |
+
_, y_train_l, _, y_val_l, _, y_test_l = train_val_test_split(X_seq, y_labels, train_pct, val_pct)
|
| 137 |
+
|
| 138 |
+
X_train_s, X_val_s, X_test_s, _ = scale_features(X_train, X_val, X_test)
|
| 139 |
+
|
| 140 |
+
train_size = len(X_train)
|
| 141 |
+
val_size = len(X_val)
|
| 142 |
+
|
| 143 |
+
# Test dates (aligned with y_test)
|
| 144 |
+
test_start = lookback + train_size + val_size
|
| 145 |
+
test_dates = df.index[test_start: test_start + len(X_test)]
|
| 146 |
+
test_slice = slice(test_start, test_start + len(X_test))
|
| 147 |
+
|
| 148 |
+
st.success(f"✅ Sequences — Train: {train_size} · Val: {val_size} · Test: {len(X_test)}")
|
| 149 |
+
|
| 150 |
+
# ── Train all three approaches ────────────────────────────────────────────────
|
| 151 |
+
results = {}
|
| 152 |
+
trained_info = {} # store extra info needed for conviction
|
| 153 |
+
|
| 154 |
+
progress = st.progress(0, text="Starting training...")
|
| 155 |
+
|
| 156 |
+
# ── Approach 1: Wavelet ───────────────────────────────────────────────────────
|
| 157 |
+
with st.spinner("🌊 Training Approach 1 — Wavelet CNN-LSTM..."):
|
| 158 |
+
try:
|
| 159 |
+
model1, hist1, _ = train_approach1(
|
| 160 |
+
X_train_s, y_train_l,
|
| 161 |
+
X_val_s, y_val_l,
|
| 162 |
+
n_classes=n_classes, epochs=int(epochs),
|
| 163 |
+
)
|
| 164 |
+
preds1, proba1 = predict_approach1(model1, X_test_s)
|
| 165 |
+
results["Approach 1"] = execute_strategy(
|
| 166 |
+
preds1, proba1, y_test_r, test_dates, target_etfs, fee_bps, tbill_rate, include_cash,
|
| 167 |
+
)
|
| 168 |
+
trained_info["Approach 1"] = {"proba": proba1}
|
| 169 |
+
st.success("✅ Approach 1 complete")
|
| 170 |
+
except Exception as e:
|
| 171 |
+
st.warning(f"⚠️ Approach 1 failed: {e}")
|
| 172 |
+
results["Approach 1"] = None
|
| 173 |
+
|
| 174 |
+
progress.progress(33, text="Approach 1 done...")
|
| 175 |
+
|
| 176 |
+
# ── Approach 2: Regime-Conditioned ───────────────────────────────────────────
|
| 177 |
+
with st.spinner("🔀 Training Approach 2 — Regime-Conditioned CNN-LSTM..."):
|
| 178 |
+
try:
|
| 179 |
+
model2, hist2, hmm2, regime_cols2 = train_approach2(
|
| 180 |
+
X_train_s, y_train_l,
|
| 181 |
+
X_val_s, y_val_l,
|
| 182 |
+
X_flat_all=X_raw,
|
| 183 |
+
feature_names=input_features,
|
| 184 |
+
lookback=lookback,
|
| 185 |
+
train_size=train_size,
|
| 186 |
+
val_size=val_size,
|
| 187 |
+
n_classes=n_classes, epochs=int(epochs),
|
| 188 |
+
)
|
| 189 |
+
preds2, proba2 = predict_approach2(
|
| 190 |
+
model2, X_test_s, X_raw, regime_cols2, hmm2,
|
| 191 |
+
lookback, train_size, val_size,
|
| 192 |
+
)
|
| 193 |
+
results["Approach 2"] = execute_strategy(
|
| 194 |
+
preds2, proba2, y_test_r, test_dates, target_etfs, fee_bps, tbill_rate, include_cash,
|
| 195 |
+
)
|
| 196 |
+
trained_info["Approach 2"] = {"proba": proba2}
|
| 197 |
+
st.success("✅ Approach 2 complete")
|
| 198 |
+
except Exception as e:
|
| 199 |
+
st.warning(f"⚠️ Approach 2 failed: {e}")
|
| 200 |
+
results["Approach 2"] = None
|
| 201 |
+
|
| 202 |
+
progress.progress(66, text="Approach 2 done...")
|
| 203 |
+
|
| 204 |
+
# ── Approach 3: Multi-Scale ───────────────────────────────────────────────────
|
| 205 |
+
with st.spinner("📡 Training Approach 3 — Multi-Scale CNN-LSTM..."):
|
| 206 |
+
try:
|
| 207 |
+
model3, hist3 = train_approach3(
|
| 208 |
+
X_train_s, y_train_l,
|
| 209 |
+
X_val_s, y_val_l,
|
| 210 |
+
n_classes=n_classes, epochs=int(epochs),
|
| 211 |
+
)
|
| 212 |
+
preds3, proba3 = predict_approach3(model3, X_test_s)
|
| 213 |
+
results["Approach 3"] = execute_strategy(
|
| 214 |
+
preds3, proba3, y_test_r, test_dates, target_etfs, fee_bps, tbill_rate, include_cash,
|
| 215 |
+
)
|
| 216 |
+
trained_info["Approach 3"] = {"proba": proba3}
|
| 217 |
+
st.success("✅ Approach 3 complete")
|
| 218 |
+
except Exception as e:
|
| 219 |
+
st.warning(f"⚠️ Approach 3 failed: {e}")
|
| 220 |
+
results["Approach 3"] = None
|
| 221 |
+
|
| 222 |
+
progress.progress(100, text="All approaches complete!")
|
| 223 |
+
progress.empty()
|
| 224 |
+
|
| 225 |
+
# ── Select winner ─────────────────────────────────────────────────────────────
|
| 226 |
+
winner_name = select_winner(results)
|
| 227 |
+
winner_res = results.get(winner_name)
|
| 228 |
+
|
| 229 |
+
if winner_res is None:
|
| 230 |
+
st.error("❌ All approaches failed. Please check your data and configuration.")
|
| 231 |
+
st.stop()
|
| 232 |
+
|
| 233 |
+
# ── Next trading date ─────────────────────────────────────────────────────────
|
| 234 |
+
next_date = get_next_signal_date()
|
| 235 |
+
|
| 236 |
+
st.divider()
|
| 237 |
+
|
| 238 |
+
# ── Signal banner (winner) ────────────────────────────────────────────────────
|
| 239 |
+
show_signal_banner(winner_res["next_signal"], next_date, winner_name)
|
| 240 |
+
|
| 241 |
+
# ── Conviction panel ──────────────────────────────────────────────────────────
|
| 242 |
+
winner_proba = trained_info[winner_name]["proba"]
|
| 243 |
+
conviction = compute_conviction(winner_proba[-1], target_etfs, include_cash)
|
| 244 |
+
show_conviction_panel(conviction)
|
| 245 |
+
|
| 246 |
+
st.divider()
|
| 247 |
+
|
| 248 |
+
# ── Winner metrics ────────────────────────────────────────────────────────────
|
| 249 |
+
st.subheader(f"📊 {winner_name} — Performance Metrics")
|
| 250 |
+
show_metrics_row(winner_res, tbill_rate)
|
| 251 |
+
|
| 252 |
+
st.divider()
|
| 253 |
+
|
| 254 |
+
# ── Comparison table ──────────────────────────────────────────────────────────
|
| 255 |
+
st.subheader("🏆 Approach Comparison (Winner = Highest Raw Annualised Return)")
|
| 256 |
+
comparison_df = build_comparison_table(results, winner_name)
|
| 257 |
+
show_comparison_table(comparison_df)
|
| 258 |
+
|
| 259 |
+
# ── Comparison bar chart ──────────────────────────────────────────────────────
|
| 260 |
+
st.plotly_chart(comparison_bar_chart(results, winner_name), use_container_width=True)
|
| 261 |
+
|
| 262 |
+
st.divider()
|
| 263 |
+
|
| 264 |
+
# ── Equity curves ─────────────────────────────────────────────────────────────
|
| 265 |
+
st.subheader("📈 Out-of-Sample Equity Curves — All Approaches vs Benchmarks")
|
| 266 |
+
fig = equity_curve_chart(results, winner_name, test_dates, df, test_slice, tbill_rate)
|
| 267 |
+
st.plotly_chart(fig, use_container_width=True)
|
| 268 |
+
|
| 269 |
+
st.divider()
|
| 270 |
+
|
| 271 |
+
# ── Audit trail (winner) ──────────────────────────────────────────────────────
|
| 272 |
+
st.subheader(f"📋 Audit Trail — {winner_name} (Last 20 Trading Days)")
|
| 273 |
+
show_audit_trail(winner_res["audit_trail"])
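
The sequence preparation in `app.py` computes `test_start = lookback + train_size + val_size` to align test dates with rows of the data frame. The sketch below shows why that offset holds. It assumes each window of `lookback` rows is paired with the row immediately after it and that the split is chronological. `build_windows` and `chrono_split` are hypothetical stand-ins for `build_sequences` and `train_val_test_split` in `models/base.py`, which this diff does not include.

```python
def build_windows(features, targets, lookback):
    """Pair each window features[i:i+lookback] with targets[i+lookback]."""
    n = len(features) - lookback
    X = [features[i:i + lookback] for i in range(n)]
    y = [targets[i + lookback] for i in range(n)]
    return X, y


def chrono_split(X, y, train_pct, val_pct):
    """Split in time order (no shuffling) into train / validation / test."""
    n_train = int(len(X) * train_pct)
    n_val = int(len(X) * val_pct)
    cut = n_train + n_val
    return (X[:n_train], y[:n_train]), (X[n_train:cut], y[n_train:cut]), (X[cut:], y[cut:])
```

Under these assumptions, sequence `k` targets row `k + lookback`, so the first test target is row `lookback + train_size + val_size`, which matches the offset used for `test_dates`.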
|
hf_space/.gitattributes
ADDED
|
@@ -0,0 +1,35 @@
| 1 |
+
*.7z filter=lfs diff=lfs merge=lfs -text
|
| 2 |
+
*.arrow filter=lfs diff=lfs merge=lfs -text
|
| 3 |
+
*.bin filter=lfs diff=lfs merge=lfs -text
|
| 4 |
+
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
| 5 |
+
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
| 6 |
+
*.ftz filter=lfs diff=lfs merge=lfs -text
|
| 7 |
+
*.gz filter=lfs diff=lfs merge=lfs -text
|
| 8 |
+
*.h5 filter=lfs diff=lfs merge=lfs -text
|
| 9 |
+
*.joblib filter=lfs diff=lfs merge=lfs -text
|
| 10 |
+
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
| 11 |
+
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
| 12 |
+
*.model filter=lfs diff=lfs merge=lfs -text
|
| 13 |
+
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
| 14 |
+
*.npy filter=lfs diff=lfs merge=lfs -text
|
| 15 |
+
*.npz filter=lfs diff=lfs merge=lfs -text
|
| 16 |
+
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 17 |
+
*.ot filter=lfs diff=lfs merge=lfs -text
|
| 18 |
+
*.parquet filter=lfs diff=lfs merge=lfs -text
|
| 19 |
+
*.pb filter=lfs diff=lfs merge=lfs -text
|
| 20 |
+
*.pickle filter=lfs diff=lfs merge=lfs -text
|
| 21 |
+
*.pkl filter=lfs diff=lfs merge=lfs -text
|
| 22 |
+
*.pt filter=lfs diff=lfs merge=lfs -text
|
| 23 |
+
*.pth filter=lfs diff=lfs merge=lfs -text
|
| 24 |
+
*.rar filter=lfs diff=lfs merge=lfs -text
|
| 25 |
+
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
| 26 |
+
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
| 27 |
+
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
| 28 |
+
*.tar filter=lfs diff=lfs merge=lfs -text
|
| 29 |
+
*.tflite filter=lfs diff=lfs merge=lfs -text
|
| 30 |
+
*.tgz filter=lfs diff=lfs merge=lfs -text
|
| 31 |
+
*.wasm filter=lfs diff=lfs merge=lfs -text
|
| 32 |
+
*.xz filter=lfs diff=lfs merge=lfs -text
|
| 33 |
+
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
+
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
+
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
hf_space/Dockerfile
ADDED
|
@@ -0,0 +1,20 @@
| 1 |
+
FROM python:3.13.5-slim
|
| 2 |
+
|
| 3 |
+
WORKDIR /app
|
| 4 |
+
|
| 5 |
+
RUN apt-get update && apt-get install -y \
|
| 6 |
+
build-essential \
|
| 7 |
+
curl \
|
| 8 |
+
git \
|
| 9 |
+
&& rm -rf /var/lib/apt/lists/*
|
| 10 |
+
|
| 11 |
+
COPY requirements.txt ./
|
| 12 |
+
COPY src/ ./src/
|
| 13 |
+
|
| 14 |
+
RUN pip3 install -r requirements.txt
|
| 15 |
+
|
| 16 |
+
EXPOSE 8501
|
| 17 |
+
|
| 18 |
+
HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health
|
| 19 |
+
|
| 20 |
+
ENTRYPOINT ["streamlit", "run", "src/streamlit_app.py", "--server.port=8501", "--server.address=0.0.0.0"]
|
hf_space/README.md
ADDED
|
@@ -0,0 +1,19 @@
| 1 |
+
---
|
| 2 |
+
title: P2 ETF CNN LSTM ALTERNATIVE APPROACHES
|
| 3 |
+
emoji: 🚀
|
| 4 |
+
colorFrom: red
|
| 5 |
+
colorTo: red
|
| 6 |
+
sdk: docker
|
| 7 |
+
app_port: 8501
|
| 8 |
+
tags:
|
| 9 |
+
- streamlit
|
| 10 |
+
pinned: false
|
| 11 |
+
short_description: Streamlit template space
|
| 12 |
+
---
|
| 13 |
+
|
| 14 |
+
# Welcome to Streamlit!
|
| 15 |
+
|
| 16 |
+
Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
|
| 17 |
+
|
| 18 |
+
If you have any questions, check out our [documentation](https://docs.streamlit.io) and [community
|
| 19 |
+
forums](https://discuss.streamlit.io).
|
hf_space/requirements.txt
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
altair
|
| 2 |
+
pandas
|
| 3 |
+
streamlit
|
hf_space/src/streamlit_app.py
ADDED
|
@@ -0,0 +1,40 @@
| 1 |
+
import altair as alt
|
| 2 |
+
import numpy as np
|
| 3 |
+
import pandas as pd
|
| 4 |
+
import streamlit as st
|
| 5 |
+
|
| 6 |
+
"""
|
| 7 |
+
# Welcome to Streamlit!
|
| 8 |
+
|
| 9 |
+
Edit `/src/streamlit_app.py` to customize this app to your heart's desire :heart:.
|
| 10 |
+
If you have any questions, check out our [documentation](https://docs.streamlit.io) and [community
|
| 11 |
+
forums](https://discuss.streamlit.io).
|
| 12 |
+
|
| 13 |
+
In the meantime, below is an example of what you can do with just a few lines of code:
|
| 14 |
+
"""
|
| 15 |
+
|
| 16 |
+
num_points = st.slider("Number of points in spiral", 1, 10000, 1100)
|
| 17 |
+
num_turns = st.slider("Number of turns in spiral", 1, 300, 31)
|
| 18 |
+
|
| 19 |
+
indices = np.linspace(0, 1, num_points)
|
| 20 |
+
theta = 2 * np.pi * num_turns * indices
|
| 21 |
+
radius = indices
|
| 22 |
+
|
| 23 |
+
x = radius * np.cos(theta)
|
| 24 |
+
y = radius * np.sin(theta)
|
| 25 |
+
|
| 26 |
+
df = pd.DataFrame({
|
| 27 |
+
"x": x,
|
| 28 |
+
"y": y,
|
| 29 |
+
"idx": indices,
|
| 30 |
+
"rand": np.random.randn(num_points),
|
| 31 |
+
})
|
| 32 |
+
|
| 33 |
+
st.altair_chart(alt.Chart(df, height=700, width=700)
|
| 34 |
+
.mark_point(filled=True)
|
| 35 |
+
.encode(
|
| 36 |
+
x=alt.X("x", axis=None),
|
| 37 |
+
y=alt.Y("y", axis=None),
|
| 38 |
+
color=alt.Color("idx", legend=None, scale=alt.Scale()),
|
| 39 |
+
size=alt.Size("rand", legend=None, scale=alt.Scale(range=[1, 150])),
|
| 40 |
+
))
|
requirements.txt
CHANGED
|
@@ -1,3 +1,29 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
-
|
| 1 |
+
# Core
|
| 2 |
+
streamlit>=1.32.0
|
| 3 |
+
pandas>=2.0.0
|
| 4 |
+
numpy>=1.24.0
|
| 5 |
+
|
| 6 |
+
# Hugging Face
|
| 7 |
+
huggingface_hub>=0.21.0
|
| 8 |
+
datasets>=2.18.0
|
| 9 |
+
|
| 10 |
+
# Machine Learning
|
| 11 |
+
tensorflow>=2.14.0
|
| 12 |
+
scikit-learn>=1.3.0
|
| 13 |
+
xgboost>=2.0.0
|
| 14 |
+
|
| 15 |
+
# Wavelet (Approach 1)
|
| 16 |
+
PyWavelets>=1.5.0
|
| 17 |
+
|
| 18 |
+
# Regime detection (Approach 2)
|
| 19 |
+
hmmlearn>=0.3.0
|
| 20 |
+
|
| 21 |
+
# Visualisation
|
| 22 |
+
plotly>=5.18.0
|
| 23 |
+
|
| 24 |
+
# NYSE Calendar
|
| 25 |
+
pandas_market_calendars>=4.3.0
|
| 26 |
+
pytz>=2024.1
|
| 27 |
+
|
| 28 |
+
# Parquet
|
| 29 |
+
pyarrow>=14.0.0
|