Abstract
Large language models can be trained to produce calibrated probabilistic forecasts for supply chain disruptions, outperforming existing baselines and enabling decision-ready predictions through domain-specific adaptation.
Anticipating supply chain disruptions before they materialize is a core challenge for firms and policymakers alike. A key difficulty is learning to reason reliably about infrequent, high-impact events from noisy and unstructured inputs, a setting where general-purpose models struggle without task-specific adaptation. We introduce an end-to-end framework that trains LLMs to produce calibrated probabilistic forecasts using realized disruption outcomes as supervision. The resulting model substantially outperforms strong baselines, including GPT-5, on accuracy, calibration, and precision. We also show that training induces more structured and reliable probabilistic reasoning without explicit prompting. These results suggest a general pathway for training domain-specific forecasting models that produce decision-ready signals. To support transparency, we open-source the evaluation dataset used in this study. Dataset: https://huggingface.co/datasets/LightningRodLabs/supply-chain-predictions
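The abstract does not specify how realized outcomes are converted into a training signal; as a minimal illustrative sketch, a proper scoring rule such as the Brier score can turn a binary disruption outcome into a reward for a stated probability (the function name and the choice of scoring rule here are assumptions, not the paper's method):

```python
# Illustrative sketch only: the paper's actual reward is not specified here.
# Assumes a Brier-score-style proper scoring rule over binary outcomes.

def forecast_reward(predicted_prob: float, realized_outcome: int) -> float:
    """Negative Brier score as a reward: maximized when the stated
    probability matches the realized outcome (1 = disruption, 0 = none),
    penalizing confident wrong forecasts most heavily."""
    assert 0.0 <= predicted_prob <= 1.0
    assert realized_outcome in (0, 1)
    return -(predicted_prob - realized_outcome) ** 2

# A confident wrong forecast scores worse than a hedged one:
print(forecast_reward(0.9, 0))  # -0.81
print(forecast_reward(0.6, 0))  # -0.36
```

Because a proper scoring rule is maximized in expectation by reporting the true probability, optimizing such a reward pushes the model toward calibrated forecasts rather than confident binary guesses.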
Community
We train an LLM to forecast supply chain disruptions from news using Foresight Learning, an RL-based framework that supervises probabilistic predictions with realized outcomes. Our fine-tuned model outperforms GPT-5 across all metrics on a held-out test set covering 25 countries and 88 product categories, with Precision@10% improving from 8.7% to 34.8%. We further show that training under a forecasting objective induces structured probabilistic reasoning without explicit prompting: base-rate anchoring, statistical modeling, and iterative uncertainty refinement emerge spontaneously, with the model learning to think about prediction rather than just pattern-match to an answer. These results suggest a general pathway for training domain-specific forecasting models wherever predictive signal exists in unstructured text. We open-source the evaluation dataset.
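Precision@10% measures how many of the model's highest-probability warnings turn out to be realized disruptions. A minimal sketch under the standard reading of this metric (the paper's exact protocol, e.g. tie-breaking or per-country pooling, may differ):

```python
import numpy as np

def precision_at_fraction(probs: np.ndarray, labels: np.ndarray,
                          fraction: float = 0.10) -> float:
    """Precision among the top `fraction` of examples ranked by predicted
    disruption probability: the share of those top-ranked forecasts whose
    realized outcome was a disruption (label 1)."""
    n_top = max(1, int(round(len(probs) * fraction)))
    top_idx = np.argsort(-probs)[:n_top]  # highest-probability forecasts first
    return float(labels[top_idx].mean())
```

On this reading, the reported jump from 8.7% to 34.8% means that roughly one in three of the fine-tuned model's top-decile warnings corresponds to an actual disruption, versus fewer than one in ten for GPT-5.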