Fin-MoE Latent Encoder

A multimodal encoder that fuses Microstructure Physics (Level-3 order flow) with Macro-Sentiment (news/policy data) into a shared latent space. Designed as the intelligence head for 0DTE (zero-days-to-expiration) alpha generation.

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                       Fin-MoE Latent Encoder                       β”‚
β”‚                                                                    β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                  β”‚
β”‚  β”‚  Track A:     β”‚          β”‚  Track B:         β”‚                  β”‚
β”‚  β”‚  Micro-Flow   β”‚          β”‚  Macro-Sentiment  β”‚                  β”‚
β”‚  β”‚  Encoder      β”‚          β”‚  Encoder          β”‚                  β”‚
β”‚  β”‚               β”‚          β”‚                   β”‚                  β”‚
β”‚  β”‚  L3 Tape ──►  β”‚          β”‚  News/SEC ──►     β”‚                  β”‚
β”‚  β”‚  MLOFI ──►    β”‚          β”‚  Earnings ──►     β”‚                  β”‚
β”‚  β”‚  ActiveDepth─►│          β”‚  Policy ──►       β”‚                  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜          β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                  β”‚
β”‚          β”‚                           β”‚                             β”‚
β”‚          β–Ό                           β–Ό                             β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚  Financial Multi-Head Cross-Attention (FMHCA) β”‚                 β”‚
β”‚  β”‚  β€’ Gated residual (tanh Ξ±, init=0)            β”‚                 β”‚
β”‚  β”‚  β€’ Text queries ← Order flow K/V              β”‚                 β”‚
β”‚  β”‚  β€’ Multi-timescale positional encoding        β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚                     β”‚                                              β”‚
β”‚                     β–Ό                                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚  Sparse MoE Routing Layer                     β”‚                 β”‚
β”‚  β”‚  β€’ 8 Expert FFNs, Top-2 routing               β”‚                 β”‚
β”‚  β”‚  β€’ Expert-Choice with load balance loss       β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚                     β”‚                                              β”‚
β”‚                     β–Ό                                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                     β”‚
β”‚  β”‚ L_MSE  β”‚  β”‚ L_Direction β”‚  β”‚ L_Toxicity   β”‚                     β”‚
β”‚  β”‚ Head   β”‚  β”‚ Head        β”‚  β”‚ Head         β”‚                     β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                     β”‚
β”‚                                                                    β”‚
β”‚  Loss = Ξ±Β·L_MSE + Ξ²Β·L_Direction + Ξ³Β·L_Toxicity                     β”‚
β”‚  (Sakuma DML Hybrid Loss with learnable weights)                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
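The tanh-gated residual in the FMHCA block can be sketched in a few lines of PyTorch. This is a minimal illustration of the Flamingo-style mechanism, not the actual `cross_attention.py` API; class name, dimensions, and signatures are assumptions.

```python
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    """Sketch of a Flamingo-style gated cross-attention block: text queries
    attend to order-flow keys/values, and a tanh(alpha) gate with alpha
    initialised to 0 makes the block an identity map at the start of
    training, so a pretrained text encoder is not disturbed early on."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.alpha = nn.Parameter(torch.zeros(1))  # gate starts closed

    def forward(self, text: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
        # text: (B, T_text, d) queries; flow: (B, T_flow, d) keys/values
        fused, _ = self.attn(query=text, key=flow, value=flow)
        return text + torch.tanh(self.alpha) * fused  # gated residual

x_text = torch.randn(2, 8, 64)
x_flow = torch.randn(2, 32, 64)
out = GatedCrossAttention()(x_text, x_flow)
# With alpha = 0, tanh(alpha) = 0 and the block passes text through unchanged.
assert torch.allclose(out, x_text)
```

Because the gate opens gradually as `alpha` is learned, gradients from the fusion path cannot explode into the pretrained encoders at initialization.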

Literature Basis

Component              Reference                          Key Insight
Gated Cross-Attention  Flamingo (Alayrac et al., 2022)    tanh-gated residual prevents gradient explosion when fusing pretrained encoders
MoE Routing            Expert Choice (Zhou et al., 2022)  Experts select top-k tokens (not tokens→experts), eliminates load imbalance
Multi-Task Loss        Kendall et al. (2017)              Homoscedastic uncertainty as learnable loss weights
MLOFI                  Xu et al. (2019)                   Multi-level order flow imbalance across depth levels
Active Depth           Cont et al. (2014)                 Queue-reactive models: kinetic energy from limit order dynamics
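The Expert-Choice idea in the table can be made concrete with a toy routing function: instead of each token picking its top experts, each expert picks its top tokens, so every expert processes exactly `capacity` tokens. This is an illustrative sketch under assumed shapes, not the interface of `moe_layer.py`.

```python
import torch
import torch.nn.functional as F

def expert_choice_route(tokens: torch.Tensor, w_router: torch.Tensor,
                        capacity: int):
    """Expert-Choice routing sketch (Zhou et al., 2022).
    tokens: (n_tokens, d); w_router: (d, n_experts).
    Each expert selects its `capacity` highest-scoring tokens, so load
    balance holds by construction rather than via an auxiliary loss."""
    scores = F.softmax(tokens @ w_router, dim=-1)   # (n_tokens, n_experts)
    gates, idx = scores.topk(capacity, dim=0)       # top tokens per expert
    return gates, idx  # idx[:, e] = indices of tokens routed to expert e

torch.manual_seed(0)
tokens = torch.randn(16, 32)
w_router = torch.randn(32, 8)
# capacity = n_tokens * factor / n_experts; 16 * 2 / 8 = 4 mirrors top-2 load
gates, idx = expert_choice_route(tokens, w_router, capacity=4)
assert idx.shape == (4, 8)  # every one of the 8 experts gets exactly 4 tokens
```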

Modules

  • fin_moe/feature_extractors.py β€” MLOFI, Active Depth, Market Temperature extractors
  • fin_moe/micro_encoder.py β€” Track A: Microstructure physics encoder
  • fin_moe/macro_encoder.py β€” Track B: Macro-sentiment encoder
  • fin_moe/cross_attention.py β€” Financial Multi-Head Cross-Attention (FMHCA)
  • fin_moe/moe_layer.py β€” Sparse Mixture of Experts with Expert-Choice routing
  • fin_moe/hybrid_loss.py β€” Sakuma DML hybrid loss (MSE + Direction + Toxicity)
  • fin_moe/model.py β€” Full Fin-MoE Latent Encoder
  • fin_moe/data_pipeline.py β€” Fire-Flyer pipeline: synthetic data generators + real data loaders
  • test_model.py β€” End-to-end test with synthetic data
  • train.py β€” Training script

Installation

pip install torch transformers

Quick Start

from fin_moe.model import FinMoELatentEncoder
from fin_moe.data_pipeline import generate_synthetic_batch

model = FinMoELatentEncoder()
batch = generate_synthetic_batch(batch_size=8)
outputs = model(**batch)
print(f"Latent shape: {outputs['latent'].shape}")
print(f"Direction logits: {outputs['direction_logits'].shape}")
print(f"Return prediction: {outputs['return_pred'].shape}")
print(f"Toxicity prediction: {outputs['toxicity_pred'].shape}")
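For reference, the MLOFI input to Track A can be computed from book snapshots roughly as below: the classic order-flow-imbalance update applied independently at each depth level. This is a NumPy sketch of the standard formulation; the conventions in `fin_moe/feature_extractors.py` may differ.

```python
import numpy as np

def mlofi(bid_p, bid_q, ask_p, ask_q):
    """Multi-level order flow imbalance sketch (after Xu et al., 2019).
    Inputs: (T, L) arrays of bid/ask prices and sizes at L depth levels.
    Returns (T-1, L): per-level OFI for each snapshot transition."""
    def side_flow(p, q, sign):
        # sign=+1 for bids, -1 for asks: dp > 0 means the queue price improved
        dp = np.sign(np.diff(p, axis=0)) * sign
        return np.where(dp > 0, q[1:],          # better price: new queue arrives
               np.where(dp < 0, -q[:-1],        # worse price: old queue drained
                        q[1:] - q[:-1]))        # same price: net size change
    return side_flow(bid_p, bid_q, +1) - side_flow(ask_p, ask_q, -1)

# Two snapshots of a two-level book (prices and queue sizes)
bid_p = np.array([[100.0, 99.9], [100.1, 100.0]])
bid_q = np.array([[50.0, 80.0], [40.0, 60.0]])
ask_p = np.array([[100.2, 100.3], [100.2, 100.3]])
ask_q = np.array([[30.0, 70.0], [45.0, 70.0]])
print(mlofi(bid_p, bid_q, ask_p, ask_q))  # one L-vector per transition
```

Positive values indicate net buy-side pressure at that depth level; stacking the per-level series gives the MLOFI channel fed to the micro-flow encoder.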

License

MIT
