Wunder Fund LOB Predictorium: Causal Global-Norm GRU Ensemble (CPU/ONNX)
Streaming model for the Wunder Fund "LOB Fairy Predictorium" challenge: predict two
anonymized future price-movement targets (t0, t1) from sequences of Limit-Order-Book
states, under the official contract: 1 CPU core, 16 GB RAM, offline, ≤60 min, scored by
Weighted Pearson Correlation (weights |target|, predictions clipped to [-6, 6]),
averaged over the two targets.
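As a rough illustration of the metric described above, the following sketch computes a weighted Pearson correlation with weights |target| and predictions clipped to [-6, 6], then averages it over the two target columns. The function names and the zero score for degenerate inputs are assumptions made for this example; the official scorer shipped in utils.py remains the reference.

```python
import numpy as np


def weighted_pearson(y_true, y_pred, clip=6.0):
    """Weighted Pearson correlation with weights |y_true| and clipped predictions."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), -clip, clip)
    w = np.abs(y_true)
    w_sum = w.sum()
    if w_sum == 0.0:
        return 0.0  # assumption: an all-zero target scores zero

    # Weighted means, then weighted covariance and variances.
    mean_t = (w * y_true).sum() / w_sum
    mean_p = (w * y_pred).sum() / w_sum
    dt, dp = y_true - mean_t, y_pred - mean_p
    cov = (w * dt * dp).sum()
    denom = np.sqrt((w * dt * dt).sum() * (w * dp * dp).sum())
    return float(cov / denom) if denom > 0 else 0.0


def mean_weighted_pearson(targets, preds):
    """Average the per-column score over the target columns (t0, t1)."""
    targets, preds = np.asarray(targets), np.asarray(preds)
    scores = [weighted_pearson(targets[:, j], preds[:, j])
              for j in range(targets.shape[1])]
    return float(np.mean(scores))
```

Because the weights are |target|, rows with large moves dominate the score, and rows whose target is exactly zero do not contribute at all.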
Portfolio project on the completed public competition, trained only on the provided data. All scores are local validation (valid.parquet), not private-leaderboard results. Code: https://github.com/msrishav-28/wunder-fund-LOB-fairy-predictorium
Results (official scorer, full valid.parquet, 1,444 sequences)
| Solution | weighted_pearson | t0 | t1 |
|---|---|---|---|
| Provided baseline (vanilla GRU) | 0.2595 | 0.388 | 0.131 |
| This ensemble (per-target weighted blend) | 0.2846 | 0.4163 | 0.1528 |
+0.025 over baseline (+9.7%), ≈3σ. Sequence-bootstrap 95% CI [0.271, 0.301]. Streaming inference reproduces the offline blend exactly; ≈1.1 ms/row on one core → ≈28 min for a 1,500-sequence test set.
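A sequence-level bootstrap resamples whole sequences rather than individual rows, so the within-sequence dependence between steps is preserved. The sketch below shows one way to obtain such an interval; the percentile method, the number of resamples, and the names sequence_bootstrap_ci and pearson are assumptions for illustration, not necessarily the procedure used to produce the numbers above.

```python
import numpy as np


def pearson(y_true, y_pred):
    """Plain Pearson correlation, used here as a stand-in metric."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def sequence_bootstrap_ci(seq_ids, y_true, y_pred, metric,
                          n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI for `metric`, resampling whole sequences with replacement."""
    seq_ids = np.asarray(seq_ids)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    unique_ids = np.unique(seq_ids)
    # Row indices belonging to each sequence, computed once up front.
    rows_by_seq = [np.flatnonzero(seq_ids == s) for s in unique_ids]

    rng = np.random.default_rng(seed)
    scores = np.empty(n_boot)
    for b in range(n_boot):
        # Draw as many sequences as the original set contains.
        picked = rng.integers(0, len(unique_ids), size=len(unique_ids))
        idx = np.concatenate([rows_by_seq[i] for i in picked])
        scores[b] = metric(y_true[idx], y_pred[idx])

    lo, hi = np.quantile(scores, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```

Any callable taking (y_true, y_pred) can be passed as metric, so the competition's weighted score could be substituted for the plain correlation used here.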
Model
- Input: 32 raw LOB/trade features per step (p0..p11 bid/ask prices, v0..v11 bid/ask volumes, dp0..dp3 trade prices, dv0..dv3 trade volumes).
- Preprocessing: global (train-fixed) z-normalization → causal momentum features [norm, lag1, delta, rolling-mean{5,10,20,40}] (160 or 224 dims; built online, no future leak).
- Backbone: ensemble of 10 unidirectional GRUs (2 layers, hidden 96–192), each exported as a stateful one-step ONNX graph (features, h0) → (prediction, h1). State resets per sequence.
- Combination: per-target blend, with cross-validated non-negative weights for t0 (which generalize) and a uniform average for t1 (weight-fitting overfits the near-noise target). Weights live in ensemble_config.json.
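The preprocessing step can be sketched as a small online builder that only ever looks at the current and past steps of a sequence. The class below is a hypothetical illustration, not the code in solution.py: the block order, the treatment of the first step (zero delta), and partial-window means during the first steps are all assumptions. It produces the 224-dimensional variant (7 blocks of 32); which blocks the 160-dimensional variant omits is not stated here.

```python
from collections import deque

import numpy as np


class CausalFeatureBuilder:
    """Builds [norm, lag1, delta, rolling-mean{5,10,20,40}] one step at a time."""

    WINDOWS = (5, 10, 20, 40)

    def __init__(self, mean, std, eps=1e-8):
        # Global statistics fixed on the training set, never updated online.
        self.mean = np.asarray(mean, dtype=np.float32)
        self.std = np.maximum(np.asarray(std, dtype=np.float32), eps)
        self.reset()

    def reset(self):
        """Call at the start of every sequence so no state crosses sequences."""
        self.history = deque(maxlen=max(self.WINDOWS))

    def step(self, raw):
        norm = (np.asarray(raw, dtype=np.float32) - self.mean) / self.std
        # Assumption: on the first step the lag equals the current value.
        lag1 = self.history[-1] if self.history else norm
        self.history.append(norm)
        hist = np.stack(list(self.history))  # shape (<=40, 32), oldest first
        # Means over the last w steps, using fewer rows while warming up.
        rolling = [hist[-w:].mean(axis=0) for w in self.WINDOWS]
        return np.concatenate([norm, lag1, norm - lag1, *rolling]).astype(np.float32)
```

Since every block depends only on values already seen, the features at step t are identical whether they are built offline over a full sequence or online during streaming inference.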
Files
| File | Purpose |
|---|---|
| solution.py | Streaming PredictionModel.predict(DataPoint): the exact inference path |
| ensemble_config.json | Global mean/std + per-model ONNX names, input sizes, per-target weights |
| *.onnx (×10) | The stateful one-step GRU members |
| utils.py | Competition DataPoint + official scorer (unchanged) |
| technical_report.md, RESULTS.md, FINDINGS.md | Methodology, experiment ledger, data forensics |
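To make the role of the per-target weights concrete, here is a minimal sketch of a per-target blend driven by a config dictionary. The schema, the file names member_a.onnx and member_b.onnx, and the weight values are hypothetical placeholders; the real ensemble_config.json may be organized differently. Normalizing the weights to sum to one is also an assumption of this example.

```python
import numpy as np

# Hypothetical excerpt; not the actual contents of ensemble_config.json.
config = {
    "models": [
        {"onnx": "member_a.onnx", "input_size": 160,
         "weights": {"t0": 0.7, "t1": 0.5}},
        {"onnx": "member_b.onnx", "input_size": 224,
         "weights": {"t0": 0.3, "t1": 0.5}},
    ]
}


def blend(member_preds, config):
    """Combine per-model predictions of shape (n_models, 2) into one (2,) vector."""
    member_preds = np.asarray(member_preds, dtype=np.float64)
    out = np.empty(2)
    for j, target in enumerate(("t0", "t1")):
        w = np.array([m["weights"][target] for m in config["models"]],
                     dtype=np.float64)
        w = w / w.sum()  # assumption: weights are normalized per target
        out[j] = w @ member_preds[:, j]
    return out
```

In this layout a uniform average for t1 is simply the case where every member carries the same t1 weight, so both blending rules share one code path.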
Usage
# pip install onnxruntime numpy
# Files (solution.py, utils.py, ensemble_config.json, *.onnx) must sit in one folder.
import numpy as np
from utils import DataPoint
from solution import PredictionModel

model = PredictionModel()
# Placeholder input: one step's 32 raw LOB/trade features (replace with real data; dtype assumed).
state_32 = np.zeros(32, dtype=np.float32)
# Feed one DataPoint per step in chronological order; reset is automatic on seq_ix change.
# Warm-up steps (0..98) -> returns None; scored steps (99..999) -> np.ndarray shape (2,).
pred = model.predict(DataPoint(seq_ix=0, step_in_seq=99, need_prediction=True, state=state_32))
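The call above handles a single step. A full sequence can be driven with a loop like the sketch below, which feeds points in order and keeps only the scored steps. To stay runnable without the repository files, it defines a stand-in DataPoint with the same four fields and a stub ZeroModel that follows the same contract; both, along with the helper run_stream, are hypothetical and would be replaced by the real classes from utils.py and solution.py.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class DataPoint:
    """Stand-in with the fields used above; the real class lives in utils.py."""
    seq_ix: int
    step_in_seq: int
    need_prediction: bool
    state: np.ndarray


class ZeroModel:
    """Hypothetical stub: None during warm-up, a (2,) array on scored steps."""

    def predict(self, dp):
        return np.zeros(2) if dp.need_prediction else None


def run_stream(model, points):
    """Feed points chronologically; collect (seq_ix, step, prediction) rows."""
    rows = []
    for dp in points:
        pred = model.predict(dp)  # must be called on every step, warm-up included
        if dp.need_prediction:
            if pred is None:
                raise ValueError(f"missing prediction at step {dp.step_in_seq}")
            rows.append((dp.seq_ix, dp.step_in_seq, np.asarray(pred)))
    return rows


# One synthetic 1,000-step sequence: steps 0..98 warm up, 99..999 are scored.
points = [DataPoint(0, t, t >= 99, np.zeros(32, dtype=np.float32))
          for t in range(1000)]
out = run_stream(ZeroModel(), points)
```

Calling predict on warm-up steps matters even though the return value is discarded, because a stateful model uses those steps to build its hidden state.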
Key findings (why the score is shaped this way)
- t0 is a short-horizon, causally-learnable move (≈0.42); a non-causal 3-step reconstruction reaches ≈0.75 corr, so most of the remaining gap is genuine future information.
- t1 is a long-horizon, near-noise target (causal ceiling ≈0.15), which structurally caps the averaged metric. Microstructure feature engineering added nothing, and t1-autocorrelation "tricks" helped only through target leakage, which is invalid at inference.
Limitations & honesty
Local-validation numbers only. The hidden test set is unavailable, so the published public-LB leader score (0.3240) can be neither matched nor claimed, and it sits above the bootstrap CI. Weighted correlation is a statistical score, not a claim of trading profit. No competition data is redistributed here.
License: MIT (code/weights). Trained solely on the competition-provided dataset.