---
language:
- en
license: apache-2.0
library_name: peft
tags:
- forecasting
- prediction
- reinforcement-learning
- grpo
- lora
- mixture-of-experts
- golf
- sports
- future-as-label
datasets:
- LightningRodLabs/GolfForecasting
base_model: openai/gpt-oss-120b
pipeline_tag: text-generation
model-index:
- name: Golf-Forecaster
  results:
  - task:
      type: text-generation
      name: Probabilistic Forecasting
    dataset:
      name: GolfForecasting
      type: LightningRodLabs/GolfForecasting
      split: test
    metrics:
    - type: brier_score
      value: 0.207
      name: Brier Score
    - type: ece
      value: 0.062
      name: Expected Calibration Error
---
# Golf-Forecaster
**LoRA adapter** for [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b), RL-tuned to predict professional golf outcomes: tournament winners, cuts, matchups, majors, team events, season races, world rankings, and player milestones across every major tour. Trained on 3,178 binary forecasting questions from [GolfForecasting](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting) using the [Lightning Rod SDK](https://github.com/lightning-rod-labs/lightningrod-python-sdk). On the held-out test set it outperforms GPT-5 on both Brier score (0.207 vs. 0.218) and ECE (0.062 vs. 0.106).
[Dataset](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting) · [Lightning Rod SDK](https://github.com/lightning-rod-labs/lightningrod-python-sdk) · [Future-as-Label paper](https://arxiv.org/abs/2601.06336) · [Outcome-based RL paper](https://arxiv.org/abs/2505.17989)
---
## Results
Evaluated on 855 held-out test questions (temporal split, August 2025 onward).
| Model | Brier Score | Brier Skill Score | ECE |
|-------|:---:|:---:|:---:|
| **Golf-Forecaster** | **0.207** | **+17.0%** | **0.062** |
| gpt-oss-120b (base) | 0.218 | +12.8% | 0.083 |
| GPT-5 | 0.218 | +12.8% | 0.106 |
![Brier Skill Score](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting/resolve/main/brier_skill_score.png)
![Brier Score Comparison](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting/resolve/main/brier_score_comparison.png)
![ECE Comparison](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting/resolve/main/ece_comparison.png)
**Brier Score**: Mean squared error between the predicted probability and the binary outcome (0 or 1). Lower is better. **Brier Skill Score (BSS)**: Relative improvement in Brier score over always predicting the base rate. Higher is better. **ECE**: Expected calibration error, the gap between predicted probabilities and observed outcome frequencies. Lower is better.
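The three metrics above can be sketched in plain Python. This is an illustrative sketch, not the evaluation code behind the table: it assumes the BSS reference forecast is the base rate of the evaluated set and that ECE uses 10 equal-width probability bins, neither of which this card specifies. The function names are hypothetical.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Improvement over a constant base-rate forecast.

    Assumption: the base rate is taken from the same outcomes being scored.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("BSS is undefined when every outcome is identical")
    return 1.0 - brier_score(probs, outcomes) / reference


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between average forecast and observed frequency.

    Assumption: equal-width bins over [0, 1]; n_bins=10 is a common default.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_prob = sum(p for p, _ in bucket) / len(bucket)
        frequency = sum(y for _, y in bucket) / len(bucket)
        ece += len(bucket) / len(probs) * abs(mean_prob - frequency)
    return ece


# Toy example with made-up forecasts (not data from GolfForecasting).
toy_probs = [0.9, 0.1, 0.8, 0.3]
toy_outcomes = [1, 0, 1, 0]
print(brier_score(toy_probs, toy_outcomes))
print(brier_skill_score(toy_probs, toy_outcomes))
print(expected_calibration_error(toy_probs, toy_outcomes))
```

A positive BSS means the forecasts beat the constant base-rate forecast; zero means no improvement.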
---
## Training
- **Base model**: [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (120B MoE, 5.1B active params)
- **Method**: GRPO with Brier score reward via [Tinker](https://tinker.computer)
- **Hyperparameters**: LoRA rank 32, learning rate 4e-5, batch size 32, group size 8, 100 steps
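To make the reward signal concrete, the sketch below shows one way a Brier-based reward and group-relative advantages could be computed for a group of 8 sampled forecasts on a single question. It is a sketch under assumptions, not the Tinker training code: the reward is taken to be the negative squared error, and advantages use the common GRPO normalization (subtract the group mean, divide by the group standard deviation). The function names and the `eps` term are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def brier_reward(prob: float, outcome: int) -> float:
    """Negative squared error, so a better forecast earns a higher reward.

    Assumption: the exact reward shaping used in training is not stated here.
    """
    return -((prob - outcome) ** 2)


def group_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of samples for the same question."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every sample gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Eight made-up sampled probabilities for one question that resolved "Yes".
sampled_probs = [0.10, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60, 0.70]
rewards = [brier_reward(p, outcome=1) for p in sampled_probs]
advantages = group_advantages(rewards)

for p, a in zip(sampled_probs, advantages):
    print(f"p={p:.2f}  advantage={a:+.3f}")
```

Because advantages are relative to the group, samples closer to the resolved outcome are reinforced and samples further away are discouraged, even when every forecast in the group is imperfect.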
---
## Usage
The adapter uses Tinker's module naming convention, so it requires a merge step before inference. A standalone `merge.py` script is included.
### Merge into full model
```bash
pip install torch transformers safetensors tqdm huggingface-hub
python merge.py --output ./golf-forecaster-merged
```
### Inference
```python
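import sglang as sgl

# Load the merged checkpoint; the tokenizer comes from the base model repo.
engine = sgl.Engine(
    model_path="./golf-forecaster-merged",
    tokenizer_path="openai/gpt-oss-120b",
    trust_remote_code=True,
    dtype="bfloat16",
    tp_size=2,  # tensor parallelism across 2 GPUs
)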
news_context = "... relevant news articles ..."
prompt = f"""You are a forecasting expert. Given the question and context below, predict the probability that the answer is "Yes".
Question: Will Scottie Scheffler win the 2025 Masters?
Context:
{news_context}
Respond with your reasoning, then give your final answer as a probability between 0 and 1 inside <answer></answer> tags."""
output = engine.generate(prompt, sampling_params={"max_new_tokens": 4096, "stop": ["</answer>"]})
print(output["text"])
```
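Because generation stops at `</answer>`, the returned text may end right after the probability without the closing tag. A minimal sketch of extracting the forecast follows; `parse_probability` is a hypothetical helper, not part of this repository, and it assumes the model writes a plain decimal inside the `<answer>` tag.

```python
import re
from typing import Optional

# The closing tag is optional because the stop string may be trimmed from the output.
_ANSWER_RE = re.compile(r"<answer>\s*([0-9]*\.?[0-9]+)\s*(?:</answer>)?")


def parse_probability(text: str) -> Optional[float]:
    """Return the last probability found in <answer> tags, or None if absent or invalid."""
    matches = _ANSWER_RE.findall(text)
    if not matches:
        return None
    value = float(matches[-1])
    # Reject values outside [0, 1], e.g. an answer written as a percentage.
    return value if 0.0 <= value <= 1.0 else None


# Made-up generations for illustration.
print(parse_probability("Scheffler is the favorite... <answer>0.18"))
print(parse_probability("No tagged answer here."))
```

Returning `None` instead of raising lets the caller decide how to treat unparseable generations, for example by retrying or skipping the question.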
---
## Links
- **Dataset**: [LightningRodLabs/GolfForecasting](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting)
- **Training platform**: [Tinker](https://tinker.computer)
- **Data generation**: [Lightning Rod SDK](https://github.com/lightning-rod-labs/lightningrod-python-sdk)
- **Future-as-Label paper**: [arxiv:2601.06336](https://arxiv.org/abs/2601.06336)
- **Outcome-based RL paper**: [arxiv:2505.17989](https://arxiv.org/abs/2505.17989)