metadata
license: apache-2.0
base_model:
  - Qwen/Qwen3-32B

Foresight-32B

A 32-billion parameter language model fine-tuned for probabilistic forecasting of real-world events.

Overview

Foresight-32B is a general-purpose forecasting model developed by Lightning Rod Labs. Built on Qwen3-32B and trained with outcome-based reinforcement learning, it achieves state-of-the-art forecasting performance among open-weight models and outperforms frontier LLMs 10-100x its size on prediction market benchmarks.

Key Results

In a forward-looking evaluation on 251 live Polymarket questions (July-August 2025):

| Model | Brier Score ↓ | ECE ↓ | Profitable |
|---|---|---|---|
| Foresight-32B | 0.199 | 6.0% | |
| OpenAI o3 | 0.205 | 7.8% | |
| Gemini 2.5 Pro | 0.213 | 8.2% | |
| Grok-4 | 0.218 | 9.1% | |
| Claude Opus | 0.221 | 8.9% | |
| Qwen3-32B (base) | 0.253 | 19.2% | |
| Polymarket (market) | 0.170 | | |

Foresight-32B led all tested LLMs on every metric: Brier score, expected calibration error (ECE), and profitability. Only the Polymarket market price itself achieved a lower Brier score (0.170).
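
For readers unfamiliar with the two numeric metrics, the following is a minimal sketch of how a Brier score and an ECE can be computed from binary forecasts. It is not the evaluation code behind the reported numbers: the function names, the equal-width binning, and the default of 10 bins are assumptions made for illustration.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Size-weighted average gap between mean forecast and observed frequency.

    Assumption: equal-width probability bins; other binning schemes exist.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_forecast = sum(p for p, _ in members) / len(members)
        observed_freq = sum(y for _, y in members) / len(members)
        ece += len(members) / len(probs) * abs(mean_forecast - observed_freq)
    return ece
```

Lower is better for both: a forecaster that always answers 50% scores a Brier score of 0.25, so values below that indicate forecasts that carry information about the outcome.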

How It Works

See: LLMs Can Teach Themselves to Better Predict the Future

See: Outcome-based Reinforcement Learning to Predict the Future

Synthetic Training Data (Foresight Learning)

We augment limited real-world prediction market data with synthetically generated forecasting questions using our data generation framework. The framework generates questions from streams of data (e.g., news articles) whose answers are difficult to predict at the time of asking but can be verified later. The model was trained on ~10,000 real Polymarket questions plus ~100,000 synthetic questions, so by these counts roughly 90% of the training questions are synthetic.
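
The "hard to predict now, verifiable later" property can be pictured as a record with two dates and an eventual outcome. The sketch below is purely illustrative: the `ForecastQuestion` class, its field names, and the `is_usable_for_training` check are hypothetical and do not describe the actual data generation framework.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ForecastQuestion:
    """Hypothetical record for one forecasting question."""
    text: str
    asked_on: date      # context is limited to information available by this date
    resolves_on: date   # date by which the outcome can be checked
    outcome: Optional[int] = None  # 1 = yes, 0 = no, None = not yet resolved


def is_usable_for_training(q: ForecastQuestion) -> bool:
    """A question yields a training signal only once its outcome is known
    and the resolution date lies strictly after the date it was asked."""
    return q.outcome in (0, 1) and q.asked_on < q.resolves_on
```

Under this assumed layout, a question generated from a January article that resolves in March becomes a labeled training example once the March outcome is recorded.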

Training Details

  • Base Model: Qwen3-32B
  • Training Method: GRPO
  • Training Data: ~10k Polymarket questions + ~100k synthetic forecasting questions
  • Evaluation: Held-out test set of 1,265 questions with temporal separation to prevent leakage
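
To make the training method more concrete, here is a minimal sketch of the group-relative advantage at the core of GRPO, paired with an outcome-based reward. The card names GRPO but does not specify the reward, so the negative squared error used in `brier_reward` is an assumption, as are the function names; the full training loop (sampling, policy update, any regularization) is omitted.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def brier_reward(prob: float, outcome: int) -> float:
    """Assumed reward: negative squared error, so better forecasts score higher."""
    return -((prob - outcome) ** 2)


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Standardize rewards within a group of samples drawn for the same question.

    Each sampled forecast is scored relative to its siblings, which removes
    the need for a separate learned value function.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three sampled forecasts for one question that resolved "yes" (outcome = 1).
sampled_probs = [0.2, 0.6, 0.9]
rewards = [brier_reward(p, 1) for p in sampled_probs]
advantages = group_relative_advantages(rewards)
```

In this example the 0.9 forecast receives the largest advantage and the 0.2 forecast the smallest, so the update would push the model toward the more accurate forecast for that question.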

Usage

Foresight-32B is available for use at dashboard.lightningrod.ai.

Input Format

The model accepts a forecasting question along with relevant context (news articles, background information) and outputs a probability estimate with reasoning. For a well-structured response, include instructions that specify how the answer should be formatted.

Question: Will [event] happen by [date]?

Context:
[Relevant news headlines and information up to prediction date]

Output: Probability estimate (0-100%) with reasoning
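
The template above can be assembled and its answer parsed with a few lines of Python. This is a sketch under stated assumptions: the `build_prompt` and `parse_probability` helpers and the trailing "Probability: NN%" instruction are hypothetical conventions chosen for illustration, not a format the model requires.

```python
import re
from typing import List, Optional


def build_prompt(question: str, context_items: List[str]) -> str:
    """Fill the Question/Context template and append a formatting instruction."""
    context = "\n".join(f"- {item}" for item in context_items)
    return (
        f"Question: {question}\n\n"
        f"Context:\n{context}\n\n"
        # Assumed output convention, so the answer can be parsed reliably.
        "Give your reasoning, then end with a line of the form 'Probability: NN%'."
    )


_PROB_RE = re.compile(r"Probability:\s*(\d+(?:\.\d+)?)\s*%")


def parse_probability(response: str) -> Optional[float]:
    """Return the last stated probability as a float in [0, 1], or None."""
    matches = _PROB_RE.findall(response)
    if not matches:
        return None
    value = float(matches[-1]) / 100.0
    return value if 0.0 <= value <= 1.0 else None
```

Taking the last match rather than the first is a deliberate choice here: reasoning text may mention intermediate percentages before the final answer.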

Citation

If you use Foresight-32B in your research, please cite:

@article{turtel2025outcome,
  title={Outcome-based Reinforcement Learning to Predict the Future},
  author={Turtel, Benjamin and others},
  journal={arXiv preprint arXiv:2505.17989},
  year={2025}
}

@article{turtel2025llms,
  title={LLMs Can Teach Themselves to Better Predict the Future},
  author={Turtel, Benjamin and Franklin, Danny and Schoenegger, Philipp},
  journal={arXiv preprint arXiv:2502.05253},
  year={2025}
}

Contact

If you are interested in generating training data for your own models or fine-tuning custom prediction agents on your domain-specific data, reach out to support@lightningrod.ai.

License

apache-2.0