HorizonSurge Ensemble

HorizonSurge Ensemble is a multimodal, "horizon-aware" machine learning ensemble designed to accurately forecast global migration volumes and trigger reliable early-warning alerts for mass-migration surges up to 6 months in advance.

It tracks data across 15 high-volume origin countries, ingesting monthly sequences of Legal Visa Issuances, Macroeconomic Exchange Rates, and NLP-Extracted News Sentiment Clusters.

Model Architecture

What makes this model unique is its Dynamic Horizon Weighting. Predicting a crisis 1 month away calls for different modeling strengths than predicting one 6 months away, so the ensemble dynamically blends three underlying architectures:

  1. Tree-Ensemble (Random Forest): Exceptionally robust at broad surge envelope thresholding. Highly weighted for near-term (Lead 1-2) forecasting.
  2. PyTorch LSTM (with Custom SurgeJointLoss): Fused with categorical Country Embeddings (nn.Embedding), this recurrent network is trained on a custom Huber + BCE objective. It acts as the "Precision Guard," heavily penalizing false alarms.
  3. PyTorch Multi-Head Transformer: Superior at maintaining long-term sequential recall. Highly weighted for long-term (Lead 5-6) predictions to capture slow-moving crisis patterns that short-term architectures forget.
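The blending step can be sketched as a per-lead convex combination of the three models' forecasts. The weight values below are illustrative assumptions chosen to match the description above (RF dominant at Leads 1-2, Transformer dominant at Leads 5-6), not the shipped configuration:

```python
import numpy as np

# Hypothetical per-lead weights for (random_forest, lstm, transformer).
# Rows = Leads 1..6; each row sums to 1.
HORIZON_WEIGHTS = np.array([
    [0.6, 0.3, 0.1],  # Lead 1: near-term, RF-heavy
    [0.5, 0.3, 0.2],  # Lead 2
    [0.4, 0.3, 0.3],  # Lead 3
    [0.3, 0.3, 0.4],  # Lead 4
    [0.2, 0.3, 0.5],  # Lead 5
    [0.1, 0.3, 0.6],  # Lead 6: long-term, Transformer-heavy
])

def blend(rf_preds, lstm_preds, transformer_preds):
    """Blend three 6-element forecasts into one ensemble forecast."""
    stacked = np.stack([rf_preds, lstm_preds, transformer_preds], axis=1)  # (6, 3)
    return (HORIZON_WEIGHTS * stacked).sum(axis=1)                         # (6,)
```

Because each row is a convex combination, the ensemble output for every lead always lies between the most pessimistic and most optimistic of the three base forecasts.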

Performance Metrics (Out-of-Time Walk-Forward Validation)

Evaluated specifically on its ability to classify operational crisis surges (volumes more than 1.5 standard deviations above the rolling mean):

| Predictive Horizon    | Precision (False Alarm Guard) | Recall (Surge Capture) | F1-Score |
|-----------------------|-------------------------------|------------------------|----------|
| Lead 1 (Next Month)   | 0.96                          | 0.96                   | 0.96     |
| Lead 2 (2 Months Out) | 0.93                          | 0.96                   | 0.95     |
| Lead 3 (3 Months Out) | 0.92                          | 0.94                   | 0.93     |
| Lead 4 (4 Months Out) | 0.88                          | 0.94                   | 0.91     |
| Lead 5 (5 Months Out) | 0.83                          | 0.94                   | 0.88     |
| Lead 6 (6 Months Out) | 0.80                          | 0.92                   | 0.86     |

Notice that even 6 months into the future, the Transformer-weighted backbone allows the ensemble to capture 92% of all major crises with an 80% precision rate.
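The surge label behind these metrics can be reproduced with a rolling-statistics rule. This is a minimal sketch of the "1.5 standard deviations above the rolling mean" definition; the 12-month window length is an assumption, as the card does not state the rolling window used:

```python
import numpy as np

def label_surges(volumes, window=12, k=1.5):
    """Flag months whose volume exceeds the trailing rolling mean by
    k rolling standard deviations. Months with insufficient history
    are left unflagged. Window length is an illustrative assumption."""
    volumes = np.asarray(volumes, dtype=float)
    flags = np.zeros(len(volumes), dtype=bool)
    for t in range(window, len(volumes)):
        hist = volumes[t - window:t]
        flags[t] = volumes[t] > hist.mean() + k * hist.std()
    return flags
```

Applied to a flat series of 100 followed by a jump to 500, only the jump month is flagged, since it alone clears the rolling mean plus 1.5 rolling standard deviations.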

How to Use

First, clone the repository and ensure you have torch, scikit-learn, numpy, and joblib installed.

Load the files using the MigrationSurgeEnsemble inference wrapper:

from inference import MigrationSurgeEnsemble

# 1. Initialize the ensemble (points to the directory containing the .pth and .joblib files)
predictor = MigrationSurgeEnsemble(models_dir=".")

# 2. Provide the rolling 6-month historical data for a specific country
# Format per month: [visa_volume, exchange_rate, news_sentiment_count]
# Array structure: [T-6, T-5, T-4, T-3, T-2, T-1 (Current)]
historical_scenario = [
    [15000, 19.5, 45],  # Lag 6
    [16000, 19.8, 52],  # Lag 5
    [18500, 19.9, 70],  # Lag 4
    [22000, 20.3, 85],  # Lag 3
    [24000, 20.5, 110], # Lag 2
    [31000, 21.0, 140]  # Lag 1
]

# 3. Generate 6-month forward projections
results = predictor.predict(country_name="Mexico", recent_6_months_data=historical_scenario)

print(results['Ensemble Prediction Volume'])
# Output: [36051.0, 38024.0, 41200.0, 43156.0, 44800.0, 41200.0]

Repository Structure Included

  • rf_lead_1.joblib through rf_lead_6.joblib: Six independent Random Forest models, one per predictive horizon.
  • lstm.pth: PyTorch weights for the Recurrent Architecture targeting extreme spikes.
  • transformer.pth: PyTorch weights for the Multi-Head Attention Architecture.
  • scaler_x.joblib, scaler_y.joblib: Fitted StandardScaler objects that normalize incoming inference data to match the training distribution.
  • country_map.json: Required dictionary mapping country names to categorical embedding IDs.
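How these artifacts fit together at inference time can be sketched as follows. The map contents and the mean/scale values below are illustrative stand-ins (a real run would load country_map.json and scaler_x.joblib from disk):

```python
import json
import numpy as np

# Illustrative stand-in for country_map.json: name -> embedding ID.
country_map = json.loads('{"Mexico": 0, "India": 1, "Philippines": 2}')

# Illustrative stand-ins for the fitted scaler_x parameters
# (per-feature mean and scale, as a fitted StandardScaler stores them).
x_mean = np.array([20000.0, 20.0, 80.0])
x_scale = np.array([5000.0, 0.5, 30.0])

def prepare_inputs(country_name, recent_6_months):
    """Map the country to its embedding ID and standardize the (6, 3)
    feature window the same way StandardScaler.transform would."""
    country_id = country_map[country_name]
    window = (np.asarray(recent_6_months, dtype=float) - x_mean) / x_scale
    return country_id, window
```

The integer ID feeds the LSTM's nn.Embedding lookup, while the standardized window is what the networks actually consume; scaler_y would invert the same transform on the predicted volumes.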