# Fin-MoE Latent Encoder
A multimodal encoder that fuses Microstructure Physics (Level-3 order flow) with Macro-Sentiment (news/policy data) into a shared latent space. Designed as the intelligence head for 0DTE alpha generation.
## Architecture
```
                       Fin-MoE Latent Encoder

  ┌──────────────┐            ┌──────────────────┐
  │  Track A:    │            │  Track B:        │
  │  Micro-Flow  │            │  Macro-Sentiment │
  │  Encoder     │            │  Encoder         │
  │              │            │                  │
  │  L3 Tape     │            │  News/SEC        │
  │  MLOFI       │            │  Earnings        │
  │  ActiveDepth │            │  Policy          │
  └──────┬───────┘            └─────────┬────────┘
         │                              │
         ▼                              ▼
  ┌───────────────────────────────────────────────┐
  │ Financial Multi-Head Cross-Attention (FMHCA)  │
  │  • Gated residual (tanh α, init=0)            │
  │  • Text queries → Order flow K/V              │
  │  • Multi-timescale positional encoding        │
  └───────────────────────┬───────────────────────┘
                          │
                          ▼
  ┌───────────────────────────────────────────────┐
  │           Sparse MoE Routing Layer            │
  │  • 8 Expert FFNs, Top-2 routing               │
  │  • Expert-Choice with load balance loss       │
  └───────────────────────┬───────────────────────┘
                          │
                          ▼
  ┌─────────┐     ┌─────────────┐     ┌────────────┐
  │  L_MSE  │     │ L_Direction │     │ L_Toxicity │
  │  Head   │     │    Head     │     │    Head    │
  └─────────┘     └─────────────┘     └────────────┘

  Loss = α·L_MSE + β·L_Direction + γ·L_Toxicity
         (Sakuma DML Hybrid Loss with learnable weights)
```
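The fusion stage follows the Flamingo-style gated cross-attention listed below: macro-sentiment (text) tokens form the queries, order-flow tokens supply keys and values, and the attended output is added back through a `tanh` gate whose parameter starts at zero, so the two pretrained tracks are initially undisturbed. Here is a minimal PyTorch sketch of that mechanism; the dimensions, layer norms, and module names are illustrative assumptions, not the repository's exact FMHCA implementation.

```python
import torch
import torch.nn as nn

class GatedCrossAttentionBlock(nn.Module):
    """Illustrative FMHCA-style block: text queries attend to order-flow K/V,
    with a tanh-gated residual whose gate is initialized to zero."""

    def __init__(self, d_model: int = 256, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(d_model)
        self.norm_kv = nn.LayerNorm(d_model)
        self.alpha = nn.Parameter(torch.zeros(1))  # tanh(0) = 0 => identity at init

    def forward(self, text_tokens, flow_tokens):
        q = self.norm_q(text_tokens)        # (B, T_text, D) sentiment queries
        kv = self.norm_kv(flow_tokens)      # (B, T_flow, D) order-flow keys/values
        attended, _ = self.attn(q, kv, kv)  # cross-attention: text -> order flow
        return text_tokens + torch.tanh(self.alpha) * attended

# Fuse 32 sentiment tokens with 128 order-flow tokens (shapes are illustrative)
block = GatedCrossAttentionBlock()
fused = block(torch.randn(4, 32, 256), torch.randn(4, 128, 256))
print(fused.shape)  # torch.Size([4, 32, 256])
```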
## Literature Basis
| Component | Reference | Key Insight |
|---|---|---|
| Gated Cross-Attention | Flamingo (Alayrac et al., 2022) | tanh-gated residual prevents gradient explosion when fusing pretrained encoders |
| MoE Routing | Expert Choice (Zhou et al., 2022) | Experts select top-k tokens (not tokens→experts), eliminating load imbalance |
| Multi-Task Loss | Kendall et al. (2017) | Homoscedastic uncertainty as learnable loss weights |
| MLOFI | Xu et al. (2019) | Multi-level order flow imbalance across depth levels |
| Active Depth | Cont et al. (2014) | Queue-reactive models: kinetic energy from limit order dynamics |
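The Expert-Choice row describes the reverse of classic top-k routing: each expert selects its own fixed budget of tokens by router score, rather than each token selecting experts, so per-expert load is equalized by construction. A minimal sketch of that selection rule follows; the expert width, capacity, and softmax placement are assumptions for illustration, not the repository's `fin_moe/moe_layer.py`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExpertChoiceMoE(nn.Module):
    """Illustrative Expert-Choice layer (Zhou et al., 2022): experts pick tokens."""

    def __init__(self, d_model: int = 256, n_experts: int = 8, capacity: int = 16):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.capacity = capacity  # how many tokens each expert processes

    def forward(self, x):
        B, T, D = x.shape
        tokens = x.reshape(B * T, D)                     # flatten batch and time
        scores = F.softmax(self.router(tokens), dim=-1)  # (B*T, n_experts)
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # expert e chooses its own top-`capacity` tokens by routing score
            w, idx = scores[:, e].topk(min(self.capacity, tokens.size(0)))
            out[idx] += w.unsqueeze(-1) * expert(tokens[idx])
        return out.reshape(B, T, D)

moe = ExpertChoiceMoE()
print(moe(torch.randn(2, 64, 256)).shape)  # torch.Size([2, 64, 256])
```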
## Modules
- `fin_moe/feature_extractors.py` – MLOFI, Active Depth, Market Temperature extractors (see the MLOFI sketch after this list)
- `fin_moe/micro_encoder.py` – Track A: Microstructure physics encoder
- `fin_moe/macro_encoder.py` – Track B: Macro-sentiment encoder
- `fin_moe/cross_attention.py` – Financial Multi-Head Cross-Attention (FMHCA)
- `fin_moe/moe_layer.py` – Sparse Mixture of Experts with Expert-Choice routing
- `fin_moe/hybrid_loss.py` – Sakuma DML hybrid loss (MSE + Direction + Toxicity)
- `fin_moe/model.py` – Full Fin-MoE Latent Encoder
- `fin_moe/data_pipeline.py` – Fire-Flyer pipeline: synthetic data generators + real data loaders
- `test_model.py` – End-to-end test with synthetic data
- `train.py` – Training script
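The MLOFI features above are the multi-level generalization of order flow imbalance (Xu et al., 2019): at each book update and depth level, size is credited when that side's price improves, differenced when the price holds, and debited when it retreats, with the ask side mirrored. Below is a minimal NumPy sketch of the per-level accumulation; the repository's extractor may window, normalize, or scale differently.

```python
import numpy as np

def mlofi(bid_px, bid_sz, ask_px, ask_sz):
    """Multi-level order flow imbalance: one accumulated value per depth level.

    Inputs are (T, L) arrays of prices/sizes over T book snapshots and L levels.
    """
    T, L = bid_px.shape
    ofi = np.zeros(L)
    for t in range(1, T):
        for l in range(L):
            # bid side: + size on price improvement, delta on no change, - old size on retreat
            if bid_px[t, l] > bid_px[t - 1, l]:
                e_bid = bid_sz[t, l]
            elif bid_px[t, l] == bid_px[t - 1, l]:
                e_bid = bid_sz[t, l] - bid_sz[t - 1, l]
            else:
                e_bid = -bid_sz[t - 1, l]
            # ask side: mirrored signs (size posted at an improving ask is selling pressure)
            if ask_px[t, l] < ask_px[t - 1, l]:
                e_ask = ask_sz[t, l]
            elif ask_px[t, l] == ask_px[t - 1, l]:
                e_ask = ask_sz[t, l] - ask_sz[t - 1, l]
            else:
                e_ask = -ask_sz[t - 1, l]
            ofi[l] += e_bid - e_ask
    return ofi
```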
## Installation

```bash
pip install torch transformers
```
## Quick Start

```python
from fin_moe.model import FinMoELatentEncoder
from fin_moe.data_pipeline import generate_synthetic_batch

model = FinMoELatentEncoder()
batch = generate_synthetic_batch(batch_size=8)
outputs = model(**batch)

print(f"Latent shape: {outputs['latent'].shape}")
print(f"Direction logits: {outputs['direction_logits'].shape}")
print(f"Return prediction: {outputs['return_pred'].shape}")
print(f"Toxicity prediction: {outputs['toxicity_pred'].shape}")
```
## License
MIT