| --- |
| license: apache-2.0 |
| base_model: |
| - Qwen/Qwen3-32B |
| tags: |
| - forecasting |
| - prediction |
| - reinforcement-learning |
| - calibration |
| - polymarket |
| pipeline_tag: text-generation |
| --- |
| # Foresight V1 32B - Open-Source Forecasting Model |
| **Lightning Rod Labs** | [lightningrod.ai](https://lightningrod.ai/) |
|
|
| Foresight V1 32B is a forecasting model fine-tuned from Qwen3-32B via outcome-based reinforcement learning (RL). Despite being 10-100x smaller than the frontier models it was evaluated against, it has **outperformed them** on Brier score, expected calibration error (ECE), and profitability. |
|
|
| Our latest model, Foresight V3, can be tested at [dashboard.lightningrod.ai](https://dashboard.lightningrod.ai/). |
|
|
| Lightning Rod Labs takes you from raw data to fine-tuned model, with automated training data generation, fine-tuning, and evaluation all in one place. No manual labeling required. |
|
|
| ### 3rd-Party Benchmarks 🏆 |
|
|
| Feb 2026: Foresight V1 32B ranked #1 on Prophet Arena Sports, a benchmark run by SIGMA Lab at UChicago, beating Grok-4, GPT-5.2, Gemini 3 Pro, and Claude Opus 4.5 on live prediction questions. |
|
|
| Jan 2026: Foresight V1 32B is the [only non-frontier model in the top 5](https://forecastingresearch.substack.com/p/llms-are-closing-the-gap-on-human) on ForecastBench, an independent forecasting benchmark run by the Forecasting Research Institute, where AIs compete on real-world forecasting questions. |
|
|
| ## Key Results |
|
|
| Evaluated on August 25, 2025 against 251 live Polymarket questions, **Foresight V1 32B outperformed every frontier model tested** on accuracy (Brier score), calibration (ECE), and profitability. |
|
|
| <img src="image%207.png" width="1000"> |
|
|
| Further details on our methodology and results are available [here](https://blog.lightningrod.ai/p/foresight-32b-beats-frontier-llms-on-live-polymarket-predictions). |
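For readers new to the calibration metric above: expected calibration error (ECE) bins forecasts by predicted probability and measures the average gap between predicted confidence and observed outcome frequency. A minimal sketch follows; the bin count and equal-width binning are illustrative assumptions, and the evaluation's exact ECE variant may differ.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """ECE sketch: bin forecasts by predicted probability, then average
    the gap between each bin's mean forecast and its observed outcome
    frequency, weighted by bin size. Lower is better calibrated."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Equal-width bins over [0, 1]; p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_p = sum(p for p, _ in b) / len(b)   # mean predicted probability
        freq = sum(y for _, y in b) / len(b)    # observed outcome frequency
        ece += len(b) / n * abs(avg_p - freq)
    return ece
```

For example, forecasts of 0.9 on events that never happen yield an ECE of 0.9, while forecasts that match observed frequencies yield 0.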
|
|
| ## How It Works |
|
|
| Foresight V1 32B was trained using outcome-based RL. The model was shown only information available at prediction time, forced to commit to a probability, and scored against the realized outcome using the Brier score as the reward signal. Confident wrong predictions were penalized more heavily than uncertain ones, directly incentivizing calibration over overconfidence. |
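The scoring rule described above can be sketched as follows. The exact reward shaping used in training is not specified here, so treating the negated Brier score as the per-question reward is an assumption:

```python
def brier_score(p: float, outcome: int) -> float:
    """Squared error between the forecast probability and the
    realized binary outcome (0 or 1); lower is better."""
    return (p - outcome) ** 2

def reward(p: float, outcome: int) -> float:
    """Hypothetical RL reward: the negated Brier score. A confident
    wrong forecast (p=0.9, outcome=0) costs far more than an
    uncertain one (p=0.5), which incentivizes calibration
    over overconfidence."""
    return -brier_score(p, outcome)

# Confident-and-right > uncertain > confident-and-wrong.
assert reward(0.9, 1) > reward(0.5, 1) > reward(0.9, 0)
```

Because the Brier score is a proper scoring rule, the reward is maximized in expectation only by reporting one's true probability, so the training signal pushes toward calibrated forecasts rather than hedged or extreme ones.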
|
|
| Training data was generated using our Foresight Data platform, which automatically transformed unstructured sources into labeled training datasets; no human annotation required. |
|
|
| The same framework has been applied across domains, including finance, healthcare, insurance, and sports analytics, to create prediction agents and domain-expert models. |
|
|
| See: [LLMs Can Teach Themselves to Better Predict the Future](https://arxiv.org/abs/2502.05253) · [Outcome-based Reinforcement Learning to Predict the Future](https://arxiv.org/abs/2505.17989) · [Future-as-Label: Scalable Supervision from Real-World Outcomes](https://arxiv.org/abs/2601.06336) |
|
|
| ## Output Format |
|
|
| Our recommended usage is for prediction questions; the model is also compatible with the OpenAI API format. |
| <img src="image%206.png" width="600"> |
|
|
| ## About Lightning Rod Labs |
|
|
| Lightning Rod Labs automates the path from raw data to fine-tuned model: training data generation, fine-tuning, and evaluation in one place, with no manual labeling required. Our research is peer-reviewed and published, including in Transactions on Machine Learning Research (TMLR), and our models have been benchmarked live against, and outperformed, the world's best. |
|
|
| A few highlights: |
|
|
| - 🏆 #1 on Prophet Arena Sports, beating GPT-5.2, Gemini 3 Pro, and Grok-4 (Feb 2026) |
| - 📈 Top 5 on ForecastBench, outperforming Claude, o3, and Grok-4 (Jan 2026) |
| - 🔬 Published in TMLR: 14B model matches o1 accuracy and generates >10% profit in live trading simulations [[link]](https://arxiv.org/abs/2505.17989) |
| - 🏛️ Vetted and awardable for U.S. defense procurement via DARPA ERIS and CDAO Tradewinds marketplaces |
| - 📰 Featured in The Atlantic, TIME, and the Forecasting Research Institute |
|
|
| ## Contact |
|
|
| Interested in generating training data for your own models or building a custom prediction model? |
|
|
| - 📧 [support@lightningrod.ai](mailto:support@lightningrod.ai) |
| - 📅 [Book a demo](https://calendly.com/d/ctq4-7gd-nyq/lightning-rod-demo) |
| - 🌐 [lightningrod.ai/about](https://www.lightningrod.ai/about) |
|
|
|
|
| ## License |
|
|
| [apache-2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |