Total Return Prediction - Hurricanes
The Climate Index AI LSTM model is designed to predict the total returns of commercial real estate (CRE) investments by incorporating climate-related risks, such as extreme temperatures and hurricane data, alongside financial indicators like interest rates and inflation. The model forecasts total returns in 138 Core-Based Statistical Areas (CBSAs) over a 12-quarter (3-year) forecast horizon.
Model Details
Model Description
This model uses machine learning (ML), specifically a Long Short-Term Memory (LSTM) neural network, to capture the complex relationships between climate variables, economic conditions, and property valuations. Hurricane-specific features such as wind speed, damage, and category provide additional insight into how extreme weather events affect commercial real estate returns.
- Developed by: Climate Index AI
- Model type: LSTM neural network
- License: Private Proprietary License (contact Climate Index AI Inc. for access)
Model Sources
- Repository: https://huggingface.co/climateindexai/total_return_hurricanes
- Paper: TBD
Uses
Direct Use
The model can be used to assess the impact of climate and economic factors on commercial real estate (CRE) returns, supporting investors in portfolio management and regional investment planning.
Downstream Use
The model can be integrated with other forecasting tools for broader financial analysis, particularly for regions prone to hurricanes and extreme weather events.
Out-of-Scope Use
This model is not intended for high-frequency trading or precise short-term valuations.
Bias, Risks, and Limitations
The model is limited by the scope of its training data, which primarily focuses on U.S. CBSA regions. As such, it may not generalize well to international markets. Additionally, extreme and unprecedented climate events outside historical patterns may affect its accuracy.
Recommendations
While the Climate Index AI LSTM model provides valuable insights into the potential impacts of climate and economic factors on commercial real estate returns, users should exercise caution due to certain inherent limitations:
Local vs. Global Climate Changes
Since climate change effects can vary significantly by region, the model may have varying accuracy across different CBSAs. Users are encouraged to consider local climate conditions and risks that may not be well-represented in historical data when interpreting model results.
Sensitivity to Macroeconomic Changes
The model includes financial indicators such as interest rates and inflation but may not account for sudden, significant economic shocks (e.g., rapid inflation spikes or economic downturns). Users should pair model insights with ongoing economic assessments to better gauge the model’s predictions.
Complementary Use with Expert Judgment
The model’s predictions should be considered part of a broader decision-making process. It’s advisable to incorporate expert judgment and complementary climate or economic analyses, especially for high-stakes investments or in regions with volatile climate trends.
How to Get Started with the Model
Use the code below to get started with the model.
from tensorflow.keras.models import load_model
model = load_model('total_return_hurricanes_model.keras')
Training Details
The Climate Index AI LSTM model leverages a comprehensive dataset that integrates historical climate, economic, and real estate data. This structure allows the model to capture the complex relationships between climate variables and financial conditions influencing property returns.
Training Data
The dataset used for this model comprises four main types of information, organized in quarterly time steps spanning from 1981 to 2023 across 138 Core-Based Statistical Areas (CBSAs). Each data point includes:
- Climate Data: Temperature and precipitation data come from the PRISM dataset, which provides high-resolution historical weather data, including metrics on extreme temperatures and long-term climate trends. Extreme-temperature features were compiled by counting high and low temperatures over the trailing 24 quarters, counted back from the specified quarter.
- Hurricane Data: For quarters impacted by hurricanes, the dataset includes wind speed, damage cost, pressure, and hurricane category to account for the influence of extreme weather events.
- Financial Indicators: Interest rate and inflation data are sourced from the Federal Reserve Economic Data (FRED) repository, a comprehensive resource for historical U.S. economic metrics.
- Commercial Real Estate Data: Total return data for commercial properties is sourced from the NCREIF NPI dataset, which provides historical financial performance metrics for commercial real estate in various U.S. regions.
The combined dataset enables the model to make regional-level predictions based on various influencing factors, allowing localized forecasts that account for region-specific climate risks and economic conditions.
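The 24-quarter temperature counts described in the Climate Data item can be read as a rolling sum. The sketch below is one possible interpretation, not the dataset's actual pipeline: it assumes per-quarter counts of extreme-temperature days are already available, and it assumes the window includes the current quarter. The function name `trailing_count` and the sample values are hypothetical.

```python
import numpy as np

WINDOW = 24  # trailing quarters, as stated in the Climate Data description


def trailing_count(per_quarter_counts, window=WINDOW):
    """Rolling sum over the current quarter and the (window - 1) quarters before it.

    Quarters without a full window of history are returned as NaN.
    """
    counts = np.asarray(per_quarter_counts, dtype=float)
    # Prefix sums let each window total be computed as a difference of two entries.
    csum = np.concatenate(([0.0], np.cumsum(counts)))
    out = np.full(len(counts), np.nan)
    idx = np.arange(window, len(counts) + 1)
    out[window - 1:] = csum[idx] - csum[idx - window]
    return out


# Made-up input: 30 quarters with 1, 2, ..., 30 extreme-heat days (hypothetical values).
quarterly_hot_days = np.arange(1, 31)
feature = trailing_count(quarterly_hot_days)
print(feature[23], feature[29])
```

The first 23 quarters have no complete 24-quarter history, so a real pipeline would need to decide whether to drop them or use a shorter window.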
Training Procedure
Preprocessing
The time-series data were prepared, and the model configured, as follows:
- Sequence Creation: Each input sequence spans a lookback window of 48 quarters (12 years), capturing both short-term fluctuations and long-term trends.
- Normalization: MinMax scaling transforms each feature to a range between 0 and 1, stabilizing model training and ensuring proportional contributions from all features.
- Model Architecture: The model uses an LSTM neural network architecture, with two stacked LSTM layers (70 and 35 units), and dropout and L2 regularization to reduce overfitting risks.
- Training and Testing Split: The dataset is split into training and testing sets (80-20). Early stopping prevents overfitting, halting training if validation loss does not improve.
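The sequence creation and normalization steps above can be sketched as follows. This is a minimal illustration on synthetic data, not the project's preprocessing code: the feature count of 6, the helper names, and the choice to pair each 48-quarter input with the next 12 quarters of the target are assumptions.

```python
import numpy as np

LOOKBACK = 48  # quarters of history per input sequence (from the text)
HORIZON = 12   # quarters to forecast (from the model overview)


def minmax_fit(features):
    """Return the per-feature minimum and range used to scale values to [0, 1]."""
    lo = features.min(axis=0)
    span = features.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to 0 instead of dividing by zero
    return lo, span


def minmax_apply(features, lo, span):
    """Scale features with a previously fitted minimum and range."""
    return (features - lo) / span


def make_windows(features, target, lookback=LOOKBACK, horizon=HORIZON):
    """Slice one region's quarterly history into (input window, future target) pairs."""
    X, y = [], []
    for start in range(len(features) - lookback - horizon + 1):
        end = start + lookback
        X.append(features[start:end])
        y.append(target[end:end + horizon])
    return np.array(X), np.array(y)


# Synthetic stand-in for one CBSA: 172 quarters (1981-2023), 6 hypothetical features.
rng = np.random.default_rng(0)
features = rng.normal(size=(172, 6))
target = rng.normal(size=172)

lo, span = minmax_fit(features)
scaled = minmax_apply(features, lo, span)
X, y = make_windows(scaled, target)
print(X.shape, y.shape)
```

Here the scaler is fitted on the full series for brevity. Fitting it on the training portion only and reusing `lo` and `span` on the test portion is a common way to avoid leaking test-set statistics.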
Training Hyperparameters
- Training regime: fp32
- Batch processing: Enabled, with early stopping and learning rate scheduling.
- Optimizer: Adam with gradient clipping (clip value = 1.0).
- Loss function: Mean Squared Error (MSE).
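A Keras model consistent with the architecture and hyperparameters listed above might look like the sketch below. Only the layer sizes (70 and 35 units), the use of dropout and L2 regularization, Adam with a clip value of 1.0, the MSE loss, and early stopping come from this card. The feature count, dropout rate, L2 strength, patience values, the 12-unit output layer, and the use of `ReduceLROnPlateau` as the learning rate schedule are assumptions made for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

LOOKBACK = 48     # quarters per input sequence (from the text)
HORIZON = 12      # forecast quarters (from the model overview)
N_FEATURES = 6    # hypothetical; the card does not list the feature count
DROPOUT = 0.2     # hypothetical dropout rate
L2_WEIGHT = 1e-4  # hypothetical L2 regularization strength


def build_model():
    """Two stacked LSTM layers (70 and 35 units) with dropout and L2 regularization."""
    model = keras.Sequential([
        layers.Input(shape=(LOOKBACK, N_FEATURES)),
        # The first LSTM returns the full sequence so the second LSTM can consume it.
        layers.LSTM(70, return_sequences=True,
                    kernel_regularizer=regularizers.l2(L2_WEIGHT)),
        layers.Dropout(DROPOUT),
        layers.LSTM(35, kernel_regularizer=regularizers.l2(L2_WEIGHT)),
        layers.Dropout(DROPOUT),
        layers.Dense(HORIZON),  # assumption: one output per forecast quarter
    ])
    model.compile(optimizer=keras.optimizers.Adam(clipvalue=1.0), loss="mse")
    return model


# Early stopping is named in the card; the scheduler type and patience values are assumed.
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                  restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),
]

model = build_model()
model.summary()
```

Training would then pass `callbacks` to `model.fit` together with a validation set, since both callbacks monitor `val_loss`.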
Evaluation
Testing Data, Factors & Metrics
Testing Data
The model was tested on the held-out 20% of the 80-20 train-test split. Early stopping was used during training to limit overfitting.
Metrics and Results
Model performance was measured with Root Mean Squared Error (RMSE). The model achieved an RMSE of 0.001.
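For reference, RMSE is the square root of the mean squared difference between actual and predicted values, so it is expressed in the same units as the target. A minimal sketch with toy numbers (not model output) is shown below. The function name `rmse` is illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values chosen for easy arithmetic: squared errors 9 and 16, mean 12.5.
print(rmse([0.0, 0.0], [3.0, 4.0]))
```

Because the inputs are MinMax-scaled during preprocessing, an RMSE computed on scaled targets would not be directly comparable to one computed on raw returns.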
Environmental Impact
Experiments were conducted on Google Cloud Platform in the us-central1 region, which has a carbon efficiency of 0.57 kgCO$_2$eq/kWh. A cumulative 10 hours of computation was performed on an RTX 4090 GPU (TDP of 300W). Total emissions are estimated at 1.71 kgCO$_2$eq, of which 100 percent was directly offset by the cloud provider, so the experiments had no net carbon impact.
Estimations were conducted using the MachineLearning Impact calculator (https://mlco2.github.io/impact#compute) presented in Lacoste et al. (2019).
@article{lacoste2019quantifying,
title={Quantifying the Carbon Emissions of Machine Learning},
author={Lacoste, Alexandre and Luccioni, Alexandra and Schmidt, Victor and Dandres, Thomas},
journal={arXiv preprint arXiv:1910.09700},
year={2019}
}
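The emissions estimate follows from multiplying compute time, power draw, and regional carbon intensity. The short check below reproduces the arithmetic from the figures stated above. It assumes the GPU ran at its stated TDP for the full 10 hours, which is the simplification this kind of estimate relies on.

```python
# Inputs as stated in the Environmental Impact section.
HOURS = 10               # cumulative compute time
TDP_WATTS = 300          # stated GPU power draw
CARBON_INTENSITY = 0.57  # kgCO2eq per kWh for us-central1

energy_kwh = HOURS * TDP_WATTS / 1000  # watts to kilowatts, times hours
emissions_kg = energy_kwh * CARBON_INTENSITY
print(f"{energy_kwh:.1f} kWh -> {emissions_kg:.2f} kgCO2eq")
```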
Glossary
- LSTM (Long Short-Term Memory): A type of recurrent neural network effective for time-series forecasting.
- CBSA (Core-Based Statistical Area): A region defined by the U.S. Office of Management and Budget, consisting of one or more counties anchored by an urban population center. CBSAs are widely used in economic and demographic analysis, including by the U.S. Census Bureau, as they reflect metropolitan and micropolitan areas.
- CRE (Commercial Real Estate): Property primarily used for business purposes, including office spaces, retail, industrial buildings, and multifamily housing.
- MinMax Scaling: A preprocessing technique that transforms data by scaling each feature to a specified range, typically [0, 1]. This approach is common in ML pipelines to enhance model performance.
- NCREIF (National Council of Real Estate Investment Fiduciaries): A nonprofit organization providing performance data for commercial real estate in the U.S., often through its NCREIF Property Index (NPI).
- NPI (NCREIF Property Index): An index from NCREIF that tracks the financial performance of a large sample of commercial real estate properties across the U.S. It is a widely used benchmark for CRE investment performance.
- PRISM (Parameter-elevation Regressions on Independent Slopes Model): A climate dataset that offers high-resolution historical weather data, widely used in environmental research.
- RMSE (Root Mean Squared Error): A metric used to measure the accuracy of predictions in regression tasks. RMSE represents the square root of the average squared difference between predicted and actual values, with lower values indicating better performance.
Model Card Contact
- Vitor Barros - vitor@climateindex.ai