Update README.md
README.md
CHANGED
@@ -33,14 +33,3 @@ This approach has been used to beat frontier AIs 100x larger on prediction-marke
 - **[Foresight-32B vs. Frontier LLMs](https://blog.lightningrod.ai/p/foresight-32b-beats-frontier-llms-on-live-polymarket-predictions)**: Live demonstration beating frontier models on Polymarket predictions.
 
 Foresight-32B is consistently top-ranked on [ForecastBench](https://www.forecastbench.org/tournament/) and [ProphetArena](https://www.prophetarena.co/leaderboard), despite being 10x-100x smaller than frontier models.
-
----
-
-## Models & Datasets
-
-| Resource | Description |
-|----------|-------------|
-| [Trump-Forecaster](https://huggingface.co/LightningRodLabs/Trump-Forecaster) | RL-tuned gpt-oss-120b LoRA adapter for predicting Trump administration actions. Beats GPT-5 (Brier 0.194 vs 0.200). |
-| [Golf-Forecaster](https://huggingface.co/LightningRodLabs/Golf-Forecaster) | RL-tuned gpt-oss-120b LoRA adapter for predicting professional golf outcomes. Beats GPT-5.1 (Brier 0.207 vs 0.218). |
-| [WWTD-2025](https://huggingface.co/datasets/LightningRodLabs/WWTD-2025) | 2,790 binary forecasting questions about U.S. policy under the Trump administration, with news context and ground-truth resolutions. |
-| [GolfForecasting](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting) | 4,033 binary forecasting questions about professional golf across PGA Tour, LIV Golf, LPGA, and majors. |