| | --- |
| | license: apache-2.0 |
| | base_model: |
| | - Qwen/Qwen3-32B |
| | tags: |
| | - forecasting |
| | - prediction |
| | - reinforcement-learning |
| | - calibration |
| | - polymarket |
| | pipeline_tag: text-generation |
| | --- |
| | # Foresight V1 32B - Open-Source Forecasting Model |
| | **Lightning Rod Labs** | [lightningrod.ai](https://lightningrod.ai/) |
| |
|
| | Foresight V1 32B is a forecasting model fine-tuned from Qwen3-32B via outcome-based RL. Despite being 10-100x smaller, it has **outperformed frontier models** on Brier score, ECE, and profitability. |
| |
|
| | Our latest model, Foresight V3, can be tested at [dashboard.lightningrod.ai](https://dashboard.lightningrod.ai/). |
| |
|
| | Lightning Rod Labs takes you from raw data to fine-tuned model. With automated training data generation, fine-tuning, and evaluation, all in one place. No manual labeling required. |
| |
|
| | ### 3rd Party Benchmarks π |
| |
|
| | Feb 2026: Foresight V1 32B ranked #1 on Prophet Arena Sports, a benchmark run by SIGMA Lab at UChicago, beating Grok-4, GPT-5.2, Gemini 3 Pro, and Claude Opus 4.5 on live prediction questions. |
| |
|
| | Jan 2026: Foresight V1 32B is the [only non-frontier model in the top 5](https://forecastingresearch.substack.com/p/llms-are-closing-the-gap-on-human) on ForecastBench, an independent forecasting benchmark run by the Forecasting Research Institute, where AIs compete on real-world forecasting questions. |
| |
|
| | ## Key Results |
| |
|
| | Evaluated on August 25, 2025 against 251 live Polymarket questions, **Foresight-v1 outperformed every frontier model tested** on accuracy (Brier Score), calibration (ECE), and profitability. |
| |
|
| | <img src="image%207.png" width="1000"> |
| |
|
| | Further details on our methodology and results are available [here.](https://blog.lightningrod.ai/p/foresight-32b-beats-frontier-llms-on-live-polymarket-predictions) |
| |
|
| | ## How It Works |
| |
|
| | Foresight V1 32B was trained using outcome-based RL. The model was shown only information available at prediction time, forced to commit to a probability, and scored against the realized outcome using the Brier score as the reward signal. Confident wrong predictions were penalized more heavily than uncertain ones, directly incentivizing calibration over overconfidence. |
| |
|
| | Training data was generated using our Foresight Data platform, which automatically transformed unstructured sources into labeled training datasets β no human annotation required. |
| |
|
| | The same framework has been applied across domains to create prediction agents and domain expert models, including finance, healthcare, insurance, and sports analytics. |
| |
|
| | See: [LLMs Can Teach Themselves to Better Predict the Future](https://arxiv.org/abs/2502.05253) Β· [Outcome-based Reinforcement Learning to Predict the Future](https://arxiv.org/abs/2505.17989) Β· [Future-as-Label: Scalable Supervision from Real-World Outcomes](https://arxiv.org/abs/2601.06336) |
| |
|
| | ## Output Format |
| |
|
| | Foresight-32B is OpenAI API-compatible. See [recommended usage](https://dashboard.lightningrod.ai/public/docs#tag/openai-compatible) for generating predictions. |
| | <img src="image%206.png" width="600"> |
| |
|
| | ## About Lighting Rod Labs |
| |
|
| | Lightning Rod Labs takes you from raw data to fine-tuned model. With automated training data generation, fine-tuning, and evaluation, all in one place. No manual labeling required. Our research is peer-reviewed and published, including in Transactions on Machine Learning Research (TMLR). Our models have been benchmarked live and outperformed the world's best. |
| |
|
| | A few highlights: |
| |
|
| | - π #1 on ProphetArena Sport, beating GPT-5.2, Gemini 3 Pro, and Grok-4 (Feb 2026) |
| | - π Top 5 on ForecastBench, outperforming Claude, O3, and Grok-4 (Jan 2026) |
| | - π¬ Published in TMLR: 14B model matches o1 accuracy and generates >10% profit in live trading simulations [[link]](https://arxiv.org/abs/2505.17989) |
| | - ποΈ Vetted and awardable for U.S. defense procurement via DARPA ERIS and CDAO Tradewinds marketplaces |
| | - π° Featured in The Atlantic, TIME, and the Forecasting Research Institute |
| |
|
| | ## Contact |
| |
|
| | Interested in generating training data for your own models or building a custom prediction model? |
| |
|
| | - π§ [support@lightningrod.ai](mailto:support@lightningrod.ai) |
| | - π
[Book a demo](https://calendly.com/d/ctq4-7gd-nyq/lightning-rod-demo) |
| | - π [lightningrod.ai/about](https://www.lightningrod.ai/about) |
| |
|
| |
|
| | ## License |
| |
|
| | [apache-2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |