---
title: Analytics Modules
summary: What analytics are available today, how they are shaped, and which ones are integrated into the UI and APIs.
read_when:
  - Computing technical indicators (RSI, MACD, MA, Bollinger, volatility, spectral)
  - Running fundamental valuations (DCF)
  - Analyzing options gamma exposure
  - Optimizing portfolios (Black-Litterman)
  - Simulating price paths (GBM)
---

# Analytics Modules

TerraFin's analytics package lives under `src/TerraFin/analytics/`. It is a mix
of:

- pure indicator functions used by the chart and agent APIs
- `TimeSeriesDataFrame` helpers for a few volatility transforms
- standalone analysis modules that are available from Python but not yet exposed
  as first-class interface pages or REST endpoints

The stable product-facing surface today is the chart overlay set plus the
agent-accessible technical indicators. DCF and GEX (options) are also
first-class: both have dedicated UI pages and REST endpoints. Portfolio
optimization, spectral helpers, and GBM simulation are usable from Python but
are still standalone or experimental from a UI/API perspective.

## Base utilities

`get_returns(df: TimeSeriesDataFrame)` in
`src/TerraFin/analytics/analysis/base_analytics.py` is the shared helper for
daily percentage returns with `NaN` rows removed.

## Technical analysis

The technical package is the most mature part of analytics. Most functions are
pure list-based helpers, which makes them easy to reuse from APIs, notebooks,
and adapter code.

### Core indicator contract

Most technical functions:

- accept `list[float]` input
- return an `offset` plus the computed values
- leave alignment to the caller, using `offset` to show how many leading points
  were consumed by the lookback window

| Function | Module | Signature | Returns |
|----------|--------|-----------|---------|
| `rsi` | `technical/rsi.py` | `rsi(closes, window=14)` | `(offset, values)`; offset = window + 1 |
| `macd` | `technical/macd.py` | `macd(closes, fast=12, slow=26, signal_window=9)` | `(offset, macd, signal, histogram)`; offset = slow - 1 |
| `moving_average` | `technical/ma.py` | `moving_average(closes, window)` | `(offset, values)`; offset = window - 1 |
| `bollinger_bands` | `technical/bollinger.py` | `bollinger_bands(closes, window=20, num_std=2.0)` | `(offset, upper, lower)`; offset = window - 1 |
| `realized_vol` | `technical/volatility.py` | `realized_vol(closes, window=21)` | `(offset, values)`; annualized, offset = window |
| `range_vol` | `technical/volatility.py` | `range_vol(highs, lows, window=20)` | `(offset, values)`; Parkinson's, offset = window - 1 |
| `trend_signal` | `technical/trend_signal.py` | `trend_signal(closes, window=126, distribution="normal", df=5)` | `(offset, values)`; Delta-Straddle signal in [-1, +1], offset = window + 1 |
| `trend_signal_composite` | `technical/trend_signal.py` | `trend_signal_composite(closes, windows=[32,64,126,252,504])` | `(offset, values)`; multi-timeframe averaged signal in [-1, +1] |
| `mandelbrot_fractal_dimension` | `technical/mandelbrot.py` | `mandelbrot_fractal_dimension(closes, window=65)` | `(offset, values)`; rolling path-complexity score in [1, 2], where lower is smoother / more fragile and higher is choppier / more anti-fragile. The function default is `window=65`; TerraFin's chart calls it with `window=130` explicitly and renders that line by default. Agent consumers can request 65, 130, and 260 explicitly. |
| `percentile_rank` | `technical/vol_regime.py` | `percentile_rank(values, window=126)` | `(offset, ranks)`; rolling min-max rank in [0, 100] |
| `vol_regime` | `technical/vol_regime.py` | `vol_regime(values, window=126, entry_threshold=20.0, exit_threshold=80.0)` | `(offset, regimes)`; 1=stable, 0=unstable with hysteresis |
| `lppl` | `technical/lppl.py` | `lppl(closes, n_windows=33, min_window=50, max_window=750, window_step=5, max_iter=45, seed=42)` | `LPPLResult`: confidence, full-series fit, qualifying sub-window fits. Pass `n_windows=None` to use the full article ladder (750→50 in 5-day steps). |

All module paths are relative to `src/TerraFin/analytics/analysis/`.

`technical/macd.py` also exposes `ema(values, span)` as a reusable helper.
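
The `(offset, values)` contract described above can be illustrated with a small stand-in. This is a minimal sketch, not TerraFin's `moving_average`: the function names `simple_moving_average` and `align_to_input` are hypothetical, and the only thing borrowed from the table is the convention that a window of `n` consumes `n - 1` leading points.

```python
from typing import Optional


def simple_moving_average(closes: list[float], window: int) -> tuple[int, list[float]]:
    """Stand-in indicator that follows the (offset, values) contract."""
    if window < 1:
        raise ValueError("window must be >= 1")
    values = [
        sum(closes[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(closes))
    ]
    # window - 1 leading points are consumed before the first value exists.
    return window - 1, values


def align_to_input(offset: int, values: list[float]) -> list[Optional[float]]:
    """Left-pad with None so values[i] lines up with closes[offset + i]."""
    return [None] * offset + list(values)


closes = [1.0, 2.0, 3.0, 4.0, 5.0]
offset, values = simple_moving_average(closes, window=3)
aligned = align_to_input(offset, values)
print(offset, values)  # 2 [2.0, 3.0, 4.0]
print(aligned)         # [None, None, 2.0, 3.0, 4.0]
```

Keeping alignment on the caller's side means the indicator never has to know about dates or index types; a chart adapter can pair `values[i]` with the timestamp at position `offset + i`.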

### Spectral analysis

`technical/spectral.py` contains frequency-domain utilities for cycle analysis.
These are currently standalone helpers rather than chart overlays.

| Function | Purpose |
|----------|---------|
| `power_spectrum(closes, window_func="hanning")` | FFT periodogram of log returns |
| `dominant_cycles(closes, top_n=5, window_func="hanning")` | Highest-signal periodic cycles |
| `amplitude_phase(closes, window_func="hanning")` | Amplitude and phase per frequency |
| `spectral_filter(closes, min_period=2.0, max_period=inf)` | Band-pass filtering on returns |
| `spectrogram(closes, segment_size=64, overlap=48)` | Sliding-window time-frequency power map |
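
As a rough illustration of what an FFT periodogram of log returns looks like, the sketch below uses plain NumPy. It is not the code in `technical/spectral.py`: the function name `periodogram_of_log_returns`, the demeaning step, and the power normalization are assumptions made for the example.

```python
import numpy as np


def periodogram_of_log_returns(closes):
    """Return (frequencies in cycles per bar, power) for Hann-windowed log returns."""
    prices = np.asarray(closes, dtype=float)
    log_returns = np.diff(np.log(prices))
    demeaned = log_returns - log_returns.mean()
    windowed = demeaned * np.hanning(len(demeaned))
    spectrum = np.fft.rfft(windowed)
    power = np.abs(spectrum) ** 2 / len(windowed)
    freqs = np.fft.rfftfreq(len(windowed), d=1.0)
    return freqs, power


# Synthetic prices whose log returns oscillate with a 16-bar period.
bars = np.arange(256)
synthetic_returns = 0.01 * np.sin(2 * np.pi * bars / 16)
closes = 100.0 * np.exp(np.concatenate([[0.0], np.cumsum(synthetic_returns)]))

freqs, power = periodogram_of_log_returns(closes)
peak = int(np.argmax(power[1:])) + 1  # skip the zero-frequency bin
dominant_period = 1.0 / freqs[peak]
print(dominant_period)  # 16.0
```

Reading the dominant period as `1 / frequency` is the same idea a `dominant_cycles`-style helper would build on, here reduced to the single strongest bin.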

### TimeSeriesDataFrame wrappers

`technical/volatility.py` also exposes pandas-friendly wrappers:

| Function | Input | Output |
|----------|-------|--------|
| `realized_volatility(df, window_size=21)` | `TimeSeriesDataFrame` | `TimeSeriesDataFrame` |
| `range_volatility(df, window=20)` | `TimeSeriesDataFrame` | `TimeSeriesDataFrame` |

## Fundamental analysis

Fundamental analysis lives under `src/TerraFin/analytics/analysis/fundamental/`.
DCF now lives in the dedicated package `src/TerraFin/analytics/analysis/fundamental/dcf/`.

| Entry point | Purpose |
|-------------|---------|
| `build_sp500_dcf_payload()` | Build the S&P 500 valuation payload used by Market Insights |
| `build_stock_dcf_payload(ticker, overrides=None, projection_years=None)` | Build the stock valuation payload used by the Stock Analysis page. `overrides` (`StockDCFOverrides`) carries the FCF-base-source picker, turnaround inputs, and base value/growth/beta overrides. |
| `build_stock_reverse_dcf_payload(ticker, overrides=None, projection_years=5, growth_profile="early_maturity")` | Build the reverse DCF payload used by the Stock Analysis page |
| `build_sp500_template()` / `build_stock_template(ticker, overrides=None, projection_years=None)` | Build the underlying valuation templates before presentation |
| `_select_stock_fcf_base(quarter, annual, source="auto")` | Pick the base FCF/share by source. `source` ∈ `auto` / `3yr_avg` / `ttm` / `latest_annual`. `auto` cascade is `3yr_avg → latest_annual → ttm` (the professional default; see [Analytics Notes](./analytics-notes.md#base-fcf-source-cascade)). Returns `(value, selected_source)` where `selected_source` uses response-side strings: `3yr_avg`, `annual`, `quarterly_ttm`, or `missing` (when no candidate is available). |
| `_build_turnaround_schedule(...)` | Construct the explicit per-year FCF schedule for turnaround mode (linear interp from current FCF to breakeven; post-breakeven compound fading to terminal). |
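
The base-FCF cascade in the table above can be sketched as a small lookup. This is an illustration only: `pick_fcf_base` and its `candidates` dict are hypothetical stand-ins for the real `(quarter, annual)` inputs, and treating an explicitly requested but unavailable source as `missing` is an assumption rather than documented behaviour.

```python
from typing import Optional

# Request-side source name -> response-side label, as listed in the table.
RESPONSE_LABEL = {"3yr_avg": "3yr_avg", "latest_annual": "annual", "ttm": "quarterly_ttm"}
AUTO_ORDER = ("3yr_avg", "latest_annual", "ttm")


def pick_fcf_base(
    candidates: dict[str, Optional[float]], source: str = "auto"
) -> tuple[Optional[float], str]:
    """Return (value, selected_source) following the documented cascade order."""
    order = AUTO_ORDER if source == "auto" else (source,)
    for key in order:
        value = candidates.get(key)
        if value is not None:
            return value, RESPONSE_LABEL[key]
    # Assumption: nothing usable for the requested source is reported as missing.
    return None, "missing"


print(pick_fcf_base({"3yr_avg": None, "latest_annual": 4.2, "ttm": 5.0}))  # (4.2, 'annual')
print(pick_fcf_base({"ttm": 5.0}, source="ttm"))                           # (5.0, 'quarterly_ttm')
print(pick_fcf_base({}))                                                   # (None, 'missing')
```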

DCF is exposed through the product and API endpoints:
- `GET /market-insights/api/dcf/sp500`
- `GET /stock/api/dcf?ticker=...&projectionYears=5|10|15` and `POST` for full overrides
- `GET /stock/api/reverse-dcf?ticker=...` and `POST` for overrides (`baseCashFlowPerShare`, `terminalGrowthPct`, `beta`, `equityRiskPremiumPct`, `currentPrice`, `projectionYears` 1–20, `growthProfile` `high_growth|early_maturity|fully_mature`)
- `GET /stock/api/fcf-history?ticker=...&years=10`: annual FCF/share series + the
  3yr-avg / latest-annual / TTM candidates the DCF would use, plus the source
  the `auto` cascade currently picks. Drives the FCF / Share History card and
  the FCF Base Source segmented control on the DCF input form.

Current DCF assumption notes now live in
[Analytics Notes](./analytics-notes.md).

## Risk analysis

Risk analysis lives under `src/TerraFin/analytics/analysis/risk/`.

| Entry point | Purpose |
|-------------|---------|
| `estimate_beta_5y_monthly(symbol)` | Compute TerraFin's default 5-year monthly regression beta |
| `estimate_beta_5y_monthly_adjusted(symbol)` | Compute the adjusted companion beta that shrinks toward `1.0` |
| `select_default_benchmark(symbol)` | Resolve the exchange-aware benchmark TerraFin uses for beta |

This package is currently Python-first, but `beta_5y_monthly` is now used as
the stock DCF and reverse DCF fallback when provider beta is unavailable.
Stock Analysis also exposes `GET /stock/api/beta-estimate?ticker=...` for the
manual beta-compute action in the DCF workbenches.

Beta-method and benchmark-mapping notes also live in
[Analytics Notes](./analytics-notes.md).
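
For orientation, a regression beta and a shrunk companion can be sketched in a few lines of NumPy. This is not TerraFin's implementation: `regression_beta` and `shrink_toward_one` are hypothetical names, the inputs are assumed to be already-aligned monthly return series, and the 2/3 weight is the common Blume-style choice used here purely as an assumption about how "shrinks toward `1.0`" might be parameterized.

```python
import numpy as np


def regression_beta(stock_returns, benchmark_returns) -> float:
    """OLS slope of stock returns on benchmark returns: cov(s, b) / var(b)."""
    s = np.asarray(stock_returns, dtype=float)
    b = np.asarray(benchmark_returns, dtype=float)
    if s.shape != b.shape or s.size < 2:
        raise ValueError("need two equally sized series with at least 2 points")
    cov = np.cov(s, b, ddof=1)
    return float(cov[0, 1] / cov[1, 1])


def shrink_toward_one(raw_beta: float, weight: float = 2.0 / 3.0) -> float:
    """Blend the raw beta with 1.0; the 2/3 weight is an assumed default."""
    return weight * raw_beta + (1.0 - weight) * 1.0


benchmark = [0.01, -0.02, 0.03, 0.00, -0.01]
stock = [1.5 * r + 0.001 for r in benchmark]  # beta of 1.5 by construction

raw = regression_beta(stock, benchmark)
adjusted = shrink_toward_one(raw)
print(round(raw, 6), round(adjusted, 6))  # 1.5 1.333333
```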

## Options analysis

Options analysis lives under `src/TerraFin/analytics/analysis/options/`.

| Entry point | Purpose |
|-------------|---------|
| `gamma_exposure.py` | Parse CBOE options chain, compute per-strike GEX in $B, zero-gamma strike, long/short gamma regime, call/put walls |
| `get_current_gex(ticker)` | High-level wrapper; returns a `GexPayload` dict with `available`, `spot_price`, `zero_gamma_strike`, `regime`, `total_gex_b`, `by_strike`, `by_expiration`, `largest_call_wall`, `largest_put_wall` |

GEX is now a first-class API feature. Per-ticker GEX is served by `/stock/api/gex?ticker=` and rendered in a panel on the Stock Analysis page. SPX-specific GEX is served by `/terminal/api/gex/spx` and `/terminal/api/gex/spx/history` and rendered as an accordion panel on the Market Insights page.

## Market data modules

`src/TerraFin/analytics/data/` contains data-fetching helpers used by analytics and market indicators.

| Module | Purpose |
|--------|---------|
| `spx_gex_history.py` | Fetch SqueezeMetrics DIX.csv, parse daily SPX GEX/$B and DIX ratio. 24h cache via PrivateDataService. On fetch failure, stale cached data is served if present; if the cache is empty, the caller receives an error. Used as the underlying source for the SPX GEX market indicator. |

## Portfolio optimization

Portfolio optimization lives under `src/TerraFin/analytics/analysis/portfolio/`.

| Entry point | Purpose |
|-------------|---------|
| `black_litterman(...)` | Run a Black-Litterman allocation workflow |
| `BLOutput` | Dataclass with prior/posterior returns and weights |

This is implemented as a standalone computation module rather than a UI feature.
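
The signature of `black_litterman(...)` is not shown here, so the following is only a sketch of the textbook Black-Litterman update that such a workflow typically rests on. The function name `bl_posterior`, its parameters, and the defaults `risk_aversion=2.5` and `tau=0.05` are all assumptions for illustration.

```python
import numpy as np


def bl_posterior(cov, market_weights, P, q, omega, risk_aversion=2.5, tau=0.05):
    """Return (prior returns, posterior returns, posterior weights)."""
    cov = np.asarray(cov, dtype=float)
    w_mkt = np.asarray(market_weights, dtype=float)
    P = np.atleast_2d(np.asarray(P, dtype=float))
    q = np.asarray(q, dtype=float)
    omega = np.atleast_2d(np.asarray(omega, dtype=float))

    prior = risk_aversion * cov @ w_mkt          # implied equilibrium returns
    tau_cov_inv = np.linalg.inv(tau * cov)
    omega_inv = np.linalg.inv(omega)
    # Precision-weighted blend of the prior and the investor views.
    lhs = tau_cov_inv + P.T @ omega_inv @ P
    rhs = tau_cov_inv @ prior + P.T @ omega_inv @ q
    posterior = np.linalg.solve(lhs, rhs)

    raw_weights = np.linalg.solve(risk_aversion * cov, posterior)
    return prior, posterior, raw_weights / raw_weights.sum()


cov = [[0.04, 0.01], [0.01, 0.09]]
market_weights = [0.6, 0.4]
# One absolute view: asset 0 returns 10%, with view variance 0.001.
prior, posterior, weights = bl_posterior(cov, market_weights, P=[[1, 0]], q=[0.10], omega=[[0.001]])
print(prior.round(4), posterior.round(4))
```

With these inputs the prior for asset 0 is 7% and the view says 10%, so the posterior lands between the two; a view that simply restates the prior leaves both returns and weights unchanged.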

## Pattern signals

`src/TerraFin/analytics/analysis/patterns/` is the systematic, rules-based
pattern catalog: the "do any of these named market conditions match the
latest bar?" surface. Where `technical/` exposes primitives (RSI value, MACD
line), `patterns/` evaluates whether a *named pattern* fires:
`CAPITULATION_BOTTOM`, `MA_GOLDEN_CROSS`, `WYCKOFF_SPRING`,
`52W_NEW_HIGH`, etc. Each pattern returns zero or more `Signal` objects
(`name`, `ticker`, `severity`, `message`, `snapshot`) and is stateless:
the same input frame always yields the same verdict.
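
To make the "zero or more `Signal` objects, stateless" idea concrete, here is a minimal sketch. The field names come from the text; the field types, the `"info"` severity value, the list-of-closes input (the real API takes an OHLC frame), and the `ma_golden_cross` function itself are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Signal:
    # Field names follow the text; the types are assumptions.
    name: str
    ticker: str
    severity: str
    message: str
    snapshot: dict[str, Any] = field(default_factory=dict)


def _sma(values: list[float], window: int, end: int) -> float:
    """Simple moving average of the `window` values ending at index `end`."""
    return sum(values[end - window + 1 : end + 1]) / window


def ma_golden_cross(ticker: str, closes: list[float], fast: int = 50, slow: int = 200) -> list[Signal]:
    """Fire when the fast SMA crosses above the slow SMA on the latest bar."""
    if len(closes) < slow + 1:
        return []  # not enough history to compare two consecutive bars
    last = len(closes) - 1
    fast_prev, slow_prev = _sma(closes, fast, last - 1), _sma(closes, slow, last - 1)
    fast_now, slow_now = _sma(closes, fast, last), _sma(closes, slow, last)
    if fast_prev <= slow_prev and fast_now > slow_now:
        return [Signal(
            name="MA_GOLDEN_CROSS",
            ticker=ticker,
            severity="info",  # hypothetical severity value
            message=f"{fast}-bar SMA crossed above {slow}-bar SMA",
            snapshot={"fast_sma": fast_now, "slow_sma": slow_now},
        )]
    return []


closes = [5.0, 4.0, 3.0, 2.0, 3.0, 5.0]
first = ma_golden_cross("MOH", closes, fast=2, slow=3)
second = ma_golden_cross("MOH", closes, fast=2, slow=3)
print(first == second, len(first))  # True 1
```

Because the function reads nothing but its arguments, evaluating the same series twice gives equal results, which is what makes patterns of this shape easy to backtest.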

This is the quantitative-investing layer of TerraFin: pattern-as-hypothesis,
backtested for forward-return edge, then wired into agent reports or pushed
through the realtime monitor.

### Pattern schools

Modules are split by methodology so a new pattern lands in an obvious file.

| School | Patterns |
|--------|----------|
| `trend` | 50/200 MA cross, MA50 cross, Minervini trend template, Faber 10-month TAA |
| `breakout` | Bollinger / Donchian (50, weekly 52) breakout, BB squeeze release, swing-pivot break, Darvas box, NR7 / Inside Bar, Keltner channel, 52-week high proximity, Wyckoff Spring / Upthrust |
| `meanrev` | RSI overbought / oversold, Connors RSI(2) dip in uptrend |
| `momentum` | MACD signal-line cross, Coppock curve (monthly) |
| `reversal` | Bull / bear engulfing at extreme, RSI ↔ price divergence |
| `volume` | Capitulation bottom (Wyckoff selling climax), OBV divergence, Chaikin Money Flow, Money Flow Index |

### Public API

```python
from TerraFin.analytics.analysis.patterns import evaluate, Signal

signals = evaluate("MOH", ohlc_df)  # list[Signal]
```

Every school module also exports its own `evaluate(ticker, ohlc)` if the
caller wants a narrower scan.

### Regime gates

A few patterns consult `spy_trend_ok(50)` from `patterns/_base.py`, a
day-cached "is SPY above its 50-day SMA" flag. Bullish-entry patterns
(`MINERVINI_TEMPLATE`, `52W_NEW_HIGH`) suppress fires when the broad
market is in a primary downtrend. This was added after bear-period backtests
showed those patterns producing negative-edge fires across GFC 2008,
COVID 2020, and the 2022 bear.
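
The gate logic can be sketched without any market data access. This is an illustration rather than the code in `patterns/_base.py`: `above_sma` and `apply_regime_gate` are hypothetical names, the day-level caching is left out, and returning `False` when there is too little history is an assumption.

```python
def above_sma(closes: list[float], window: int = 50) -> bool:
    """True when the latest close is above its trailing simple moving average."""
    if len(closes) < window:
        return False  # assumption: too little history keeps the gate closed
    return closes[-1] > sum(closes[-window:]) / window


# Bullish-entry patterns named in the text as being gated.
GATED = {"MINERVINI_TEMPLATE", "52W_NEW_HIGH"}


def apply_regime_gate(fired: list[str], market_trend_ok: bool) -> list[str]:
    """Drop gated pattern names when the broad-market trend check fails."""
    if market_trend_ok:
        return fired
    return [name for name in fired if name not in GATED]


uptrend = [100.0 + i for i in range(60)]
downtrend = [160.0 - i for i in range(60)]
fired = ["52W_NEW_HIGH", "CAPITULATION_BOTTOM"]
print(apply_regime_gate(fired, above_sma(uptrend)))    # ['52W_NEW_HIGH', 'CAPITULATION_BOTTOM']
print(apply_regime_gate(fired, above_sma(downtrend)))  # ['CAPITULATION_BOTTOM']
```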

### Pull vs push: same `Signal`, different trigger

`patterns/` is the **pull-driven** side: the agent flow, weekly reports,
or an ad-hoc backtest asks "evaluate every pattern on this frame now."
The **push-driven** flavor lives at `interface/infra/monitor/`: an external
realtime monitor service holds a broker WebSocket open, runs its own
intraday detectors, and POSTs each fired event to TerraFin. Both sides
emit the same `Signal` dataclass; only the trigger differs. See
[architecture.md](./architecture.md#signal-pipeline) for the pipeline
shape.

## Chart similarity search

`src/TerraFin/analytics/similarity/` implements sliding-window template matching across a large stock universe.

| Module | Role |
|--------|------|
| `pool.py` | Universe loading, per-symbol EOY price cache, process-level pool TTL |
| `scorer.py` | STUMPY MASS distance profile, result ranking |

### Algorithm

1. **Target**: current close-price series for the query ticker, fetched via `DataFactory.get_recent_history()` (live, daily-TTL cache).
2. **Pool**: full price history through end of last year for every symbol in the universe, stored as immutable parquet files at `~/.terrafin/cache/prices/{symbol}_eoy{year}.parquet`.
3. **Transform**: both target and each pool subsequence are converted to cumulative log returns `log(p[t] / p[0])`, anchoring shape at 0 and removing trend / level bias.
4. **Distance**: STUMPY `mass()` computes the z-normalized Euclidean distance profile (O(n log n) per symbol) by sliding the target template across the full history of each pool series.
5. **Score**: `max(0, 1 − min_dist / √(2N))`, where `√(2N)` is the z-normalized Euclidean distance between two uncorrelated length-N sequences. Anti-correlated windows can reach `2√N`, so the `max(0, …)` clamp floors their score at 0.
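
Steps 3 to 5 can be sketched with plain NumPy. Note the swap: this uses a naive O(n·m) sliding loop instead of STUMPY's `mass()`, so it shows the transform and scoring rather than the production distance routine. The names `cumulative_log_returns`, `z_normalize`, and `best_match` are hypothetical.

```python
import numpy as np


def cumulative_log_returns(prices) -> np.ndarray:
    """log(p[t] / p[0]): anchors the shape at 0 and removes price level."""
    p = np.asarray(prices, dtype=float)
    return np.log(p / p[0])


def z_normalize(x: np.ndarray) -> np.ndarray:
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)


def best_match(target_prices, pool_prices) -> tuple[int, float]:
    """Return (start index of the closest window, similarity score in [0, 1])."""
    n = len(target_prices)
    pool = np.asarray(pool_prices, dtype=float)
    target = z_normalize(cumulative_log_returns(target_prices))
    best_start, min_dist = -1, float("inf")
    for start in range(len(pool) - n + 1):
        window = z_normalize(cumulative_log_returns(pool[start : start + n]))
        dist = float(np.linalg.norm(target - window))
        if dist < min_dist:
            best_start, min_dist = start, dist
    score = max(0.0, 1.0 - min_dist / np.sqrt(2 * n))
    return best_start, float(score)


target = [10.0, 11.0, 12.5, 12.0, 13.0, 15.0, 14.0, 16.0]
# The same shape at three times the price level, embedded in unrelated bars.
pool = [50.0, 49.0, 51.0, 50.0, 48.0, 47.0] + [3 * p for p in target] + [40.0, 41.0]
start, score = best_match(target, pool)
print(start, round(score, 3))  # 6 1.0
```

The scaled copy is found at index 6 with a score of 1.0, which shows why the log-return transform matters: price level drops out and only shape is compared.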

### Universes

| Name | Symbols |
|------|---------|
| `sp500` | ~501 S&P 500 constituents |
| `nasdaq100` | ~101 Nasdaq-100 constituents |
| `kospi200` | ~199 KOSPI 200 constituents |
| `sp500+nasdaq100+kospi200` | Union (~713 unique); default |
| `watchlist` | User's current watchlist (not cached; fetched per request) |

### Cache behaviour

EOY parquet files are **immutable**: year-end data never changes, so no TTL is applied. On first run the pool downloads all symbols (prints `[pool] Downloading {sym} ({i}/{total})...` to stdout). Pool objects are held in a process-level dict with a 6-hour TTL to avoid re-loading 713 series per request.
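
A process-level dict with a TTL can be as small as the sketch below. It is not the code in `pool.py`: `get_or_load`, the injectable `now` argument (added so the expiry can be exercised without waiting), and the placeholder payload are all assumptions.

```python
import time
from typing import Any, Callable, Optional

POOL_TTL_SECONDS = 6 * 60 * 60  # the 6-hour TTL mentioned above
_pools: dict[str, tuple[float, Any]] = {}


def get_or_load(universe: str, loader: Callable[[], Any], now: Optional[float] = None) -> Any:
    """Return the cached pool for `universe`, reloading once the TTL has expired."""
    current = time.monotonic() if now is None else now
    entry = _pools.get(universe)
    if entry is not None and current - entry[0] < POOL_TTL_SECONDS:
        return entry[1]
    value = loader()
    _pools[universe] = (current, value)
    return value


loads = []


def fake_loader():
    loads.append(1)
    return {"symbols": ["AAA", "BBB"]}  # placeholder payload


get_or_load("sp500", fake_loader, now=0.0)
get_or_load("sp500", fake_loader, now=3600.0)      # still fresh, served from cache
get_or_load("sp500", fake_loader, now=6 * 3600.0)  # TTL reached, reloaded
print(len(loads))  # 2
```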

### Python API

```python
from TerraFin.analytics.similarity.pool import get_pool
from TerraFin.analytics.similarity.scorer import score_pool

pool = get_pool("sp500+nasdaq100+kospi200")   # loads + caches full history
results = score_pool(target_series, pool.prices(), names=pool.names(), top_n=20)
# → list[SimilarityResult(symbol, name, score, match_start, match_end, overlap_days)]
```

See `notebooks/analytics/chart_similarity_scan.ipynb` for an interactive walkthrough with visualization (target + historical match + 1-month after-move).

## Simulation

Simulation lives under `src/TerraFin/analytics/simulation/`.

| Entry point | Purpose |
|-------------|---------|
| `run_base_gbm(time_series_df, num_simulation=100, pred_ratio=0.2)` | Simulate price paths with geometric Brownian motion |

The simulation helper is available from Python and notebook workflows.
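
For readers unfamiliar with geometric Brownian motion, a minimal NumPy sketch follows. It is not `run_base_gbm`: the name `simulate_gbm_paths`, the explicit `horizon` argument (how `pred_ratio` maps to a horizon is not covered here), and the `seed` parameter are assumptions.

```python
import numpy as np


def simulate_gbm_paths(closes, num_paths: int = 100, horizon: int = 20, seed: int = 0) -> np.ndarray:
    """Simulate `num_paths` future price paths of `horizon` bars each."""
    prices = np.asarray(closes, dtype=float)
    log_returns = np.diff(np.log(prices))
    # Under GBM, log returns are i.i.d. normal; their sample mean already
    # estimates (mu - sigma^2 / 2) * dt, so no extra drift correction is needed.
    step_mean = log_returns.mean()
    step_std = log_returns.std(ddof=1)
    rng = np.random.default_rng(seed)
    steps = rng.normal(step_mean, step_std, size=(num_paths, horizon))
    # Cumulative log steps, exponentiated and anchored at the last observed close.
    return prices[-1] * np.exp(np.cumsum(steps, axis=1))


closes = [100.0, 101.0, 100.5, 102.0, 103.5, 102.8, 104.0]
paths = simulate_gbm_paths(closes, num_paths=100, horizon=20)
print(paths.shape)  # (100, 20)
```

Simulating in log space keeps every path strictly positive, and fixing the seed makes a run reproducible for notebook comparisons.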

## Integration status

This is the quickest way to understand what is already connected to the product:

| Area | Status |
|------|--------|
| Chart auto-overlays | Stable |
| Agent API indicators | Stable |
| DCF | Stable on-demand UI/API feature in Market Insights and Stock Analysis |
| GEX (options) | Stable: `/stock/api/gex` per-ticker panel on Stock Analysis; `/terminal/api/gex/spx` SPX accordion panel on Market Insights |
| Portfolio optimization / GBM | Standalone, not yet first-class UI/API features |
| Risk beta toolkit | Partially integrated: used as the stock DCF fallback and exposed through the stock beta-estimate API |
| Trend signal (Delta-Straddle) | Stable: chart overlay and agent API |
| Mandelbrot Fractal Dimension | Stable: chart overlay and agent API |
| Vol regime (percentile rank + hysteresis) | Stable: chart overlay and agent API |
| LPPL (Bubble detection) | Calibrated default active in chart overlay and agent API; full article ladder remains available in the analytics helper for research/debug runs |
| Spectral analysis | Experimental helper |
| Chart similarity search | Stable agent API (`similarity_search`); notebook demo available |
| Notebook demos | Supported but manual-only, not product-critical regression coverage |

Notebook demos live in `notebooks/analytics/`.
They should stay as manual/exploratory notebooks, not `test_*.py` replacements.
Each demo notebook should use the same explicit `configure()` bootstrap pattern
described in [Getting Started](./getting-started.md) and
[Interface Overview](./interface.md) at the top of the first code cell.

## See also

- [feature-integration.md](./feature-integration.md) for the ownership rule when a new indicator or analysis becomes a public feature
- [data-layer.md](./data-layer.md) for the input types analytics consume
- [interface.md](./interface.md) for the chart and agent APIs that call these helpers