Upload folder using huggingface_hub

Files changed (12)

README.md +82 -0
country_map.json +1 -0
lstm.pth +3 -0
rf_lead_1.joblib +3 -0
rf_lead_2.joblib +3 -0
rf_lead_3.joblib +3 -0
rf_lead_4.joblib +3 -0
rf_lead_5.joblib +3 -0
rf_lead_6.joblib +3 -0
scaler_x.joblib +3 -0
scaler_y.joblib +3 -0
transformer.pth +3 -0

README.md ADDED

	@@ -0,0 +1,82 @@

+---
+language:
+- en
+tags:
+- tabular
+- time-series
+- forecasting
+- ensemble
+- pytorch
+- scikit-learn
+- migration
+- early-warning
+license: mit
+---
+# HorizonSurge Ensemble
+**HorizonSurge Ensemble** is a multimodal, horizon-aware machine learning ensemble that forecasts monthly migration volumes and raises early-warning alerts for mass-migration surges up to 6 months in advance.
+It tracks data across 15 high-volume origin countries, ingesting monthly sequences of **Legal Visa Issuances**, **Macroeconomic Exchange Rates**, and **NLP-Extracted News Sentiment Clusters**.
+## Model Architecture
+The ensemble's distinguishing feature is **Dynamic Horizon Weighting**: the blend of model outputs changes with the forecast lead, because the patterns that help predict a crisis 1 month away differ from those that help predict one 6 months away. The ensemble blends three underlying architectures:
+1. **Tree-Ensemble (Random Forest):** One independent model per lead. Robust at thresholding the broad surge envelope, and weighted most heavily for near-term (Lead 1-2) forecasts.
+2. **PyTorch LSTM (with custom `SurgeJointLoss`):** A recurrent network that combines the input sequence with categorical country embeddings (`nn.Embedding`) and is trained on a custom Huber + BCE objective. It acts as the "Precision Guard," heavily penalizing false alarms.
+3. **PyTorch Multi-Head Transformer:** Better at retaining long-range sequence context. Weighted most heavily for long-term (Lead 5-6) predictions, to capture slow-moving crisis patterns that the shorter-memory models tend to lose.
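The horizon-dependent blending described in the README can be sketched as a per-lead convex combination of the three model forecasts. This is a minimal sketch, not the released implementation: the weight values and the names `HORIZON_WEIGHTS` and `blend_forecasts` are hypothetical, chosen only so that the Random Forest dominates at Lead 1-2 and the Transformer at Lead 5-6.

```python
# Hypothetical per-lead blend weights in the order (RF, LSTM, Transformer).
# Illustrative values only; the README does not publish the actual weights.
HORIZON_WEIGHTS = {
    1: (0.60, 0.30, 0.10),
    2: (0.50, 0.30, 0.20),
    3: (0.35, 0.35, 0.30),
    4: (0.25, 0.35, 0.40),
    5: (0.15, 0.30, 0.55),
    6: (0.10, 0.30, 0.60),
}


def blend_forecasts(rf, lstm, transformer, weights=HORIZON_WEIGHTS):
    """Blend three 6-step forecasts using lead-specific convex weights."""
    blended = []
    for lead, preds in enumerate(zip(rf, lstm, transformer), start=1):
        w = weights[lead]
        if abs(sum(w) - 1.0) > 1e-9:
            raise ValueError(f"weights for lead {lead} must sum to 1")
        # Weighted average of the three model outputs for this lead.
        blended.append(sum(wi * p for wi, p in zip(w, preds)))
    return blended
```

Because each row of weights sums to 1, the blended value always lies between the smallest and largest of the three model forecasts for that lead.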
+## Performance Metrics (Out-of-Time Walk-Forward Validation)
+The ensemble is evaluated on its ability to classify operational crisis surges, defined as volumes more than $1.5$ standard deviations above the rolling mean:
+| Predictive Horizon | Precision (False Alarm Guard) | Recall (Surge Capture Rate) | F1-Score |
+| :--- | :--- | :--- | :--- |
+| **Lead 1 (Next Month)** | 0.96 | 0.96 | **0.96** |
+| **Lead 2 (2 Months Out)** | 0.93 | 0.96 | **0.95** |
+| **Lead 3 (3 Months Out)** | 0.92 | 0.94 | **0.93** |
+| **Lead 4 (4 Months Out)** | 0.88 | 0.94 | **0.91** |
+| **Lead 5 (5 Months Out)** | 0.83 | 0.94 | **0.88** |
+| **Lead 6 (6 Months Out)** | 0.80 | 0.92 | **0.86** |
+*At a 6-month lead, the Transformer-weighted ensemble still captures 92% of major surges at 80% precision, meaning roughly one alert in five is a false alarm.*
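The surge definition and the metrics in the table can be illustrated with a short sketch. The README does not state the rolling window length or whether the window is trailing, so `window=12` and the trailing window below are assumptions, and `label_surges` and `precision_recall_f1` are hypothetical helper names.

```python
from statistics import mean, pstdev


def label_surges(volumes, window=12, k=1.5):
    """Flag month t as a surge when volumes[t] > rolling mean + k * rolling std.

    The statistics use only the `window` months before t (assumed trailing
    window), so the label for month t never includes month t itself.
    """
    labels = []
    for t, v in enumerate(volumes):
        history = volumes[max(0, t - window):t]
        if len(history) < 2:
            labels.append(False)  # not enough history to estimate a spread
            continue
        labels.append(v > mean(history) + k * pstdev(history))
    return labels


def precision_recall_f1(actual, predicted):
    """Compute precision, recall, and F1 for boolean surge labels."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

As a consistency check on the table, a precision of 0.80 and a recall of 0.92 give $F_1 = 2 \cdot 0.80 \cdot 0.92 / (0.80 + 0.92) \approx 0.86$, which matches the Lead 6 row.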
+## How to Use
+First, clone the repository and ensure you have `torch`, `scikit-learn`, `numpy`, and `joblib` installed.
+Load the files using the `MigrationSurgeEnsemble` inference wrapper:
+```python
+from inference import MigrationSurgeEnsemble
+# 1. Initialize the ensemble (points to the directory containing the .pth and .joblib files)
+predictor = MigrationSurgeEnsemble(models_dir=".")
+# 2. Provide the rolling 6-month historical data for a specific country
+# Format per month: [visa_volume, exchange_rate, news_sentiment_count]
+# Array structure: [T-6, T-5, T-4, T-3, T-2, T-1 (Current)]
+historical_scenario = [
+    [15000, 19.5, 45],  # Lag 6
+    [16000, 19.8, 52],  # Lag 5
+    [18500, 19.9, 70],  # Lag 4
+    [22000, 20.3, 85],  # Lag 3
+    [24000, 20.5, 110], # Lag 2
+    [31000, 21.0, 140]  # Lag 1
+]
+# 3. Generate 6-month forward projections
+results = predictor.predict(country_name="Mexico", recent_6_months_data=historical_scenario)
+print(results['Ensemble Prediction Volume'])
+# Output: [36051.0, 38024.0, 41200.0, 43156.0, 44800.0, 41200.0]
+```
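The usage example expects exactly six monthly rows of three features each, ordered oldest to newest. Whether `MigrationSurgeEnsemble.predict` validates its input is not stated, so a caller may want a guard like the one below before calling it. This is a sketch under that assumption; `validate_history` is a hypothetical helper, not part of the repository.

```python
import math

EXPECTED_MONTHS = 6
EXPECTED_FEATURES = 3  # visa_volume, exchange_rate, news_sentiment_count


def validate_history(rows):
    """Check a 6 x 3 history and return it as a list of lists of floats."""
    if len(rows) != EXPECTED_MONTHS:
        raise ValueError(f"expected {EXPECTED_MONTHS} months, got {len(rows)}")
    cleaned = []
    for i, row in enumerate(rows):
        if len(row) != EXPECTED_FEATURES:
            raise ValueError(
                f"row {i}: expected {EXPECTED_FEATURES} features, got {len(row)}"
            )
        values = [float(x) for x in row]
        # Reject NaN and infinity, which would silently corrupt scaled inputs.
        if any(math.isnan(x) or math.isinf(x) for x in values):
            raise ValueError(f"row {i}: non-finite value")
        cleaned.append(values)
    return cleaned
```

Failing early with a clear message is usually easier to debug than a shape error raised from inside a scaler or a model.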
+## Repository Structure
+* `rf_lead_1.joblib` through `rf_lead_6.joblib`: Six independent Random Forest models, one per forecast lead.
+* `lstm.pth`: PyTorch weights for the recurrent (LSTM) architecture, which targets extreme spikes.
+* `transformer.pth`: PyTorch weights for the Multi-Head Attention Architecture.
+* `scaler_x.joblib`, `scaler_y.joblib`: Fitted `StandardScaler` objects, so that inference inputs and outputs are normalized with the same statistics as the training data.
+* `country_map.json`: Required dictionary mapping country names to categorical embedding IDs.

country_map.json ADDED

	@@ -0,0 +1 @@

+ {"Algeria": 0, "Antigua and Barbuda": 1, "Armenia, Republic of": 2, "Australia": 3, "Austria": 4, "Bahamas, The": 5, "Bahrain, Kingdom of": 6, "Belgium": 7, "Belize": 8, "Bolivia": 9, "Brazil": 10, "Bulgaria": 11, "Burundi": 12, "Cameroon": 13, "Canada": 14, "Central African Republic": 15, "Chile": 16, "China, People's Republic of": 17, "Colombia": 18, "Congo, Democratic Republic of the": 19, "Costa Rica": 20, "Croatia, Republic of": 21, "Cyprus": 22, "Czech Republic": 23, "C\u00f4te d'Ivoire": 24, "Denmark": 25, "Dominica": 26, "Dominican Republic": 27, "Equatorial Guinea, Republic of": 28, "Euro Area (EA)": 29, "Fiji, Republic of": 30, "Finland": 31, "France": 32, "Gabon": 33, "Gambia, The": 34, "Georgia": 35, "Germany": 36, "Ghana": 37, "Greece": 38, "Grenada": 39, "Guyana": 40, "Hungary": 41, "Iceland": 42, "Iran, Islamic Republic of": 43, "Ireland": 44, "Israel": 45, "Italy": 46, "Japan": 47, "Latvia, Republic of": 48, "Lesotho, Kingdom of": 49, "Luxembourg": 50, "Malawi": 51, "Malaysia": 52, "Malta": 53, "Mexico": 54, "Moldova, Republic of": 55, "Morocco": 56, "Netherlands, The": 57, "New Zealand": 58, "Nicaragua": 59, "Nigeria": 60, "North Macedonia, Republic of": 61, "Norway": 62, "Pakistan": 63, "Papua New Guinea": 64, "Paraguay": 65, "Philippines": 66, "Poland, Republic of": 67, "Portugal": 68, "Romania": 69, "Russian Federation": 70, "Samoa": 71, "Saudi Arabia": 72, "Sierra Leone": 73, "Singapore": 74, "Slovak Republic": 75, "Solomon Islands": 76, "South Africa": 77, "Spain": 78, "St. Kitts and Nevis": 79, "St. Lucia": 80, "St. Vincent and the Grenadines": 81, "Sweden": 82, "Switzerland": 83, "Togo": 84, "Trinidad and Tobago": 85, "Tunisia": 86, "Uganda": 87, "Ukraine": 88, "United Kingdom": 89, "United States": 90, "Uruguay": 91, "Zambia": 92}
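The keys in `country_map.json` are exact strings, some in inverted form ("Netherlands, The") and one stored with a JSON Unicode escape (`C\u00f4te d'Ivoire`), so a lookup by a more familiar name fails. The sketch below shows one way to resolve a name to its embedding ID and suggest near matches on failure. It uses a three-entry excerpt of the map, and `country_id` is a hypothetical helper.

```python
import difflib
import json

# Three-entry excerpt of country_map.json; the raw string keeps the \u escape
# exactly as it appears in the file, and json.loads decodes it.
EXCERPT = r'''{"Mexico": 54, "Netherlands, The": 57, "C\u00f4te d'Ivoire": 24}'''
COUNTRY_MAP = json.loads(EXCERPT)


def country_id(name, country_map=COUNTRY_MAP):
    """Return the embedding ID for an exact country name.

    Raises KeyError listing close matches when the name is not a key.
    """
    try:
        return country_map[name]
    except KeyError:
        hints = difflib.get_close_matches(name, list(country_map), n=3, cutoff=0.6)
        raise KeyError(f"unknown country {name!r}; close matches: {hints}") from None
```

With the full file, the same function can be given `json.load(open("country_map.json", encoding="utf-8"))` as `country_map`.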

lstm.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4b569ea507eb271ccf1c3c392d62bebdf5aef7ea2040668e4629c0e2464c816d
+size 219391
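The three added lines above are a Git LFS pointer rather than the weights themselves: they record the spec version, the SHA-256 digest of the real file, and its size in bytes. After downloading the actual file, its integrity can be checked against the pointer. The sketch below assumes the pointer text has already had the leading `+` diff markers removed; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into its spec version, SHA-256 digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {
        "version": fields["version"],
        "sha256": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True when the bytes have the size and digest recorded in the pointer."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["sha256"]
    )
```

For a file the size of `transformer.pth`, reading it in chunks and updating the hash incrementally would avoid holding the whole file in memory.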

rf_lead_1.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3865ef55febe1dfb90f102a7d995a70db78502d7df7424efd652d984c6bec28a
+size 394497

rf_lead_2.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ee68a7de12bd37e665449d6bda8ad8c131e37e91c2a1c159fdf5bffc24107181
+size 381249

rf_lead_3.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0061a06681276e160a287393a14b0d272526ee7213f167a52686a60bfac58758
+size 390465

rf_lead_4.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6a74aafdba598b17775f975c9fb16ae335f75aa9119d75bba99ac57d7aeb35fc
+size 395937

rf_lead_5.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b2f3e1b3c725e234607b53365c899fc983e552fd3cd104d0e346eb4e9beb99cc
+size 393777

rf_lead_6.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ddc972504c4f3e9435b884d87f967bd070ad7d74a720af024034152460ec0360
+size 396513

scaler_x.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8aae0ef4f4fefd7ece477ac1b74ee0d011eb838e6f1c3b9bbabe48a374912478
+size 639

scaler_y.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bf29529e2fb2e2a990d3524241e9f3d0f10c431155cc32a221ebf94ed262065b
+size 727

transformer.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:aedf8a0d10caa57b66526d06b0ca8b2e3110f6879fb3cfff301e43a0a969b5a1
+size 2264536