Update README.md

#4
by gretcheny - opened
Files changed (1)
  1. README.md +2 -9
README.md CHANGED
@@ -29,15 +29,7 @@ Jan 2026: Foresight V1 32B is the [only non-frontier model in the top 5](https:/
29
 
30
  Evaluated on August 25, 2025 against 251 live Polymarket questions, **Foresight-v1 outperformed every frontier model tested** on accuracy (Brier Score), calibration (ECE), and profitability.
31
 
32
- | Model | Brier Score ↓ | ECE ↓ | Profitable |
33
- |-------|---------------|-------|------------|
34
- | **Foresight V1 32B** | **0.199** | **6.0%** | ✓ |
35
- | OpenAI o3 | 0.205 | 7.8% | ✓ |
36
- | Gemini 2.5 Pro | 0.213 | 8.2% | ✗ |
37
- | Grok-4 | 0.218 | 9.1% | ✗ |
38
- | Claude Opus | 0.221 | 8.9% | ✗ |
39
- | Qwen3-32B (base) | 0.253 | 19.2% | ✗ |
40
- | Polymarket (market) | 0.170 | — | — |
41
 
42
  Further details on our methodology and results are available [here.](https://blog.lightningrod.ai/p/foresight-32b-beats-frontier-llms-on-live-polymarket-predictions)
43
 
@@ -54,6 +46,7 @@ See: [LLMs Can Teach Themselves to Better Predict the Future](https://arxiv.org/
54
  ## Output Format
55
 
56
  Our recommended usage is for predictions, but it also works with the OpenAI API.
 
57
 
58
  ## About Lighting Rod Labs
59
 
 
29
 
30
  Evaluated on August 25, 2025 against 251 live Polymarket questions, **Foresight-v1 outperformed every frontier model tested** on accuracy (Brier Score), calibration (ECE), and profitability.
31
 
32
+ <img src="image%207.png" width="1000">
 
33
 
34
  Further details on our methodology and results are available [here.](https://blog.lightningrod.ai/p/foresight-32b-beats-frontier-llms-on-live-polymarket-predictions)
35
 
 
46
  ## Output Format
47
 
48
  Our recommended usage is for predictions, but it also works with the OpenAI API.
49
+ <img src="image%206.png" width="600">
50
 
51
  ## About Lighting Rod Labs
52