Fin-MoE Latent Encoder

A multimodal encoder that fuses Microstructure Physics (Level-3 order flow) with Macro-Sentiment (news/policy data) into a shared latent space. Designed as the intelligence head for 0DTE (zero-days-to-expiration) alpha generation.

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                       Fin-MoE Latent Encoder                       β”‚
β”‚                                                                    β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                  β”‚
β”‚  β”‚  Track A:     β”‚          β”‚  Track B:         β”‚                  β”‚
β”‚  β”‚  Micro-Flow   β”‚          β”‚  Macro-Sentiment  β”‚                  β”‚
β”‚  β”‚  Encoder      β”‚          β”‚  Encoder          β”‚                  β”‚
β”‚  β”‚               β”‚          β”‚                   β”‚                  β”‚
β”‚  β”‚  L3 Tape ──►  β”‚          β”‚  News/SEC ──►     β”‚                  β”‚
β”‚  β”‚  MLOFI ──►    β”‚          β”‚  Earnings ──►     β”‚                  β”‚
β”‚  β”‚  ActiveDepth─►│          β”‚  Policy ──►       β”‚                  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜          β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                  β”‚
β”‚          β”‚                           β”‚                             β”‚
β”‚          β–Ό                           β–Ό                             β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚  Financial Multi-Head Cross-Attention (FMHCA) β”‚                 β”‚
β”‚  β”‚  β€’ Gated residual (tanh Ξ±, init=0)            β”‚                 β”‚
β”‚  β”‚  β€’ Text queries ← Order flow K/V              β”‚                 β”‚
β”‚  β”‚  β€’ Multi-timescale positional encoding        β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚                     β”‚                                              β”‚
β”‚                     β–Ό                                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚  Sparse MoE Routing Layer                     β”‚                 β”‚
β”‚  β”‚  β€’ 8 Expert FFNs, Top-2 routing               β”‚                 β”‚
β”‚  β”‚  β€’ Expert-Choice with load balance loss       β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚                     β”‚                                              β”‚
β”‚                     β–Ό                                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                     β”‚
β”‚  β”‚ L_MSE  β”‚  β”‚ L_Direction β”‚  β”‚ L_Toxicity   β”‚                     β”‚
β”‚  β”‚ Head   β”‚  β”‚ Head        β”‚  β”‚ Head         β”‚                     β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                     β”‚
β”‚                                                                    β”‚
β”‚  Loss = Ξ±Β·L_MSE + Ξ²Β·L_Direction + Ξ³Β·L_Toxicity                     β”‚
β”‚  (Sakuma DML Hybrid Loss with learnable weights)                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
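The tanh-gated residual in the FMHCA block can be sketched in a few lines of PyTorch. This is a minimal illustration of the Flamingo-style mechanism, not the actual `cross_attention.py` API; class name, dimensions, and signatures are assumptions.

```python
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    """Sketch of a Flamingo-style gated cross-attention block: text queries
    attend to order-flow keys/values, and a tanh(alpha) gate with alpha
    initialised to 0 makes the block an identity map at the start of
    training, so a pretrained text encoder is not disturbed early on."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.alpha = nn.Parameter(torch.zeros(1))  # gate starts closed

    def forward(self, text: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
        # text: (B, T_text, d) queries; flow: (B, T_flow, d) keys/values
        fused, _ = self.attn(query=text, key=flow, value=flow)
        return text + torch.tanh(self.alpha) * fused  # gated residual

x_text = torch.randn(2, 8, 64)
x_flow = torch.randn(2, 32, 64)
out = GatedCrossAttention()(x_text, x_flow)
# With alpha = 0, tanh(alpha) = 0 and the block passes text through unchanged.
assert torch.allclose(out, x_text)
```

Because the gate opens gradually as `alpha` is learned, gradients from the fusion path cannot explode into the pretrained encoders at initialization.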

Literature Basis

Component              Reference                          Key Insight
Gated Cross-Attention  Flamingo (Alayrac et al., 2022)    tanh-gated residual prevents gradient explosion when fusing pretrained encoders
MoE Routing            Expert Choice (Zhou et al., 2022)  Experts select top-k tokens (not tokens→experts), eliminates load imbalance
Multi-Task Loss        Kendall et al. (2017)              Homoscedastic uncertainty as learnable loss weights
MLOFI                  Xu et al. (2019)                   Multi-level order flow imbalance across depth levels
Active Depth           Cont et al. (2014)                 Queue-reactive models: kinetic energy from limit order dynamics
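The Expert-Choice idea in the table can be made concrete with a toy routing function: instead of each token picking its top experts, each expert picks its top tokens, so every expert processes exactly `capacity` tokens. This is an illustrative sketch under assumed shapes, not the interface of `moe_layer.py`.

```python
import torch
import torch.nn.functional as F

def expert_choice_route(tokens: torch.Tensor, w_router: torch.Tensor,
                        capacity: int):
    """Expert-Choice routing sketch (Zhou et al., 2022).
    tokens: (n_tokens, d); w_router: (d, n_experts).
    Each expert selects its `capacity` highest-scoring tokens, so load
    balance holds by construction rather than via an auxiliary loss."""
    scores = F.softmax(tokens @ w_router, dim=-1)   # (n_tokens, n_experts)
    gates, idx = scores.topk(capacity, dim=0)       # top tokens per expert
    return gates, idx  # idx[:, e] = indices of tokens routed to expert e

torch.manual_seed(0)
tokens = torch.randn(16, 32)
w_router = torch.randn(32, 8)
# capacity = n_tokens * factor / n_experts; 16 * 2 / 8 = 4 mirrors top-2 load
gates, idx = expert_choice_route(tokens, w_router, capacity=4)
assert idx.shape == (4, 8)  # every one of the 8 experts gets exactly 4 tokens
```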

Modules

  • fin_moe/feature_extractors.py β€” MLOFI, Active Depth, Market Temperature extractors
  • fin_moe/micro_encoder.py β€” Track A: Microstructure physics encoder
  • fin_moe/macro_encoder.py β€” Track B: Macro-sentiment encoder
  • fin_moe/cross_attention.py β€” Financial Multi-Head Cross-Attention (FMHCA)
  • fin_moe/moe_layer.py β€” Sparse Mixture of Experts with Expert-Choice routing
  • fin_moe/hybrid_loss.py β€” Sakuma DML hybrid loss (MSE + Direction + Toxicity)
  • fin_moe/model.py β€” Full Fin-MoE Latent Encoder
  • fin_moe/data_pipeline.py β€” Fire-Flyer pipeline: synthetic data generators + real data loaders
  • test_model.py β€” End-to-end test with synthetic data
  • train.py β€” Training script

Installation

pip install torch transformers

Quick Start

from fin_moe.model import FinMoELatentEncoder
from fin_moe.data_pipeline import generate_synthetic_batch

model = FinMoELatentEncoder()
batch = generate_synthetic_batch(batch_size=8)
outputs = model(**batch)
print(f"Latent shape: {outputs['latent'].shape}")
print(f"Direction logits: {outputs['direction_logits'].shape}")
print(f"Return prediction: {outputs['return_pred'].shape}")
print(f"Toxicity prediction: {outputs['toxicity_pred'].shape}")
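For reference, the MLOFI input to Track A can be computed from book snapshots roughly as below: the classic order-flow-imbalance update applied independently at each depth level. This is a NumPy sketch of the standard formulation; the conventions in `fin_moe/feature_extractors.py` may differ.

```python
import numpy as np

def mlofi(bid_p, bid_q, ask_p, ask_q):
    """Multi-level order flow imbalance sketch (after Xu et al., 2019).
    Inputs: (T, L) arrays of bid/ask prices and sizes at L depth levels.
    Returns (T-1, L): per-level OFI for each snapshot transition."""
    def side_flow(p, q, sign):
        # sign=+1 for bids, -1 for asks: dp > 0 means the queue price improved
        dp = np.sign(np.diff(p, axis=0)) * sign
        return np.where(dp > 0, q[1:],          # better price: new queue arrives
               np.where(dp < 0, -q[:-1],        # worse price: old queue drained
                        q[1:] - q[:-1]))        # same price: net size change
    return side_flow(bid_p, bid_q, +1) - side_flow(ask_p, ask_q, -1)

# Two snapshots of a two-level book (prices and queue sizes)
bid_p = np.array([[100.0, 99.9], [100.1, 100.0]])
bid_q = np.array([[50.0, 80.0], [40.0, 60.0]])
ask_p = np.array([[100.2, 100.3], [100.2, 100.3]])
ask_q = np.array([[30.0, 70.0], [45.0, 70.0]])
print(mlofi(bid_p, bid_q, ask_p, ask_q))  # one L-vector per transition
```

Positive values indicate net buy-side pressure at that depth level; stacking the per-level series gives the MLOFI channel fed to the micro-flow encoder.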

License

MIT
