---
language:
- en
license: apache-2.0
library_name: peft
tags:
- forecasting
- prediction
- reinforcement-learning
- grpo
- lora
- mixture-of-experts
- golf
- sports
- future-as-label
datasets:
- LightningRodLabs/GolfForecasting
base_model: openai/gpt-oss-120b
pipeline_tag: text-generation
model-index:
- name: Golf-Forecaster
  results:
  - task:
      type: text-generation
      name: Probabilistic Forecasting
    dataset:
      name: GolfForecasting
      type: LightningRodLabs/GolfForecasting
      split: test
    metrics:
    - type: brier_score
      value: 0.207
      name: Brier Score
    - type: ece
      value: 0.062
      name: Expected Calibration Error
---

# Golf-Forecaster
LoRA adapter for gpt-oss-120b, RL-tuned to predict professional golf outcomes — tournament winners, cuts, matchups, majors, team events, season races, world rankings, and player milestones across every major tour. Trained on 3,178 binary forecasting questions from GolfForecasting using the Lightning Rod SDK. It beats GPT-5 on both accuracy (Brier score) and calibration (ECE).
Dataset · Lightning Rod SDK · Future-as-Label paper · Outcome-based RL paper
## Results
Evaluated on 855 held-out test questions (temporal split, Aug 2025+).
| Model | Brier Score | Brier Skill Score | ECE |
|---|---|---|---|
| Golf-Forecaster | 0.207 | +17.0% | 0.062 |
| gpt-oss-120b (base) | 0.218 | +12.8% | 0.083 |
| GPT-5 | 0.218 | +12.8% | 0.106 |
- **Brier Score**: mean squared error between the predicted probability and the binary outcome. Lower is better.
- **Brier Skill Score (BSS)**: improvement over always predicting the base rate. Higher is better.
- **ECE (Expected Calibration Error)**: how closely predicted probabilities match observed frequencies. Lower is better.
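For reference, all three metrics can be computed directly from paired predictions and outcomes. A minimal NumPy sketch (function names and the 10-bin ECE choice are illustrative, not the evaluation harness used for the table above):

```python
import numpy as np

def brier_score(p, y):
    """Mean squared error between predicted probability and binary outcome."""
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    return float(np.mean((p - y) ** 2))

def brier_skill_score(p, y):
    """Improvement over always predicting the base rate (returns a fraction;
    multiply by 100 for the table's percentages)."""
    y = np.asarray(y, dtype=float)
    base_rate = float(np.mean(y))
    reference = brier_score(np.full(len(y), base_rate), y)
    return 1.0 - brier_score(p, y) / reference

def ece(p, y, n_bins=10):
    """Expected Calibration Error: weighted gap between mean predicted
    probability and observed frequency within each probability bin."""
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    bins = np.minimum((p * n_bins).astype(int), n_bins - 1)
    total = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            total += mask.mean() * abs(p[mask].mean() - y[mask].mean())
    return float(total)
```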
## Training
- Base model: openai/gpt-oss-120b (120B MoE, 5.1B active params)
- Method: GRPO with Brier score reward via Tinker
- LoRA rank: 32, learning rate 4e-5, batch size 32, group size 8, 100 steps
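With a Brier score reward, each sampled completion is scored by how close its stated probability lands to the realized outcome. A hedged sketch of such a reward function (the exact answer-parsing and format penalty used in the actual Tinker run are assumptions):

```python
import re

def brier_reward(completion: str, outcome: int) -> float:
    """Reward = negative Brier score of the probability parsed from
    <answer>...</answer> tags. Illustrative only; the real reward shaping
    may differ."""
    match = re.search(r"<answer>\s*([01](?:\.\d+)?)\s*</answer>", completion)
    if match is None:
        return -1.0  # assumed format penalty: no parseable answer
    p = min(max(float(match.group(1)), 0.0), 1.0)  # clamp to [0, 1]
    return -((p - outcome) ** 2)
```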
## Usage
The adapter uses Tinker's module naming convention, so it requires a merge step before inference. A standalone merge.py script is included.
### Merge into full model

```bash
pip install torch transformers safetensors tqdm huggingface-hub
python merge.py --output ./golf-forecaster-merged
```
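The merge itself is standard LoRA weight folding: each adapted matrix becomes W + (α/r)·B·A. A minimal NumPy sketch of that arithmetic (shapes and the scaling convention are illustrative; the bundled merge.py additionally handles Tinker's module naming and the safetensors shards):

```python
import numpy as np

def merge_lora(base_weight, lora_a, lora_b, rank=32, alpha=32.0):
    """Fold a LoRA pair into the base weight: W' = W + (alpha / rank) * B @ A.

    Hypothetical shapes: base_weight (out, in), lora_a (rank, in),
    lora_b (out, rank).
    """
    scale = alpha / rank
    return base_weight + scale * (lora_b @ lora_a)
```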
### Inference

```python
import sglang as sgl

engine = sgl.Engine(
    model_path="./golf-forecaster-merged",
    tokenizer_path="openai/gpt-oss-120b",
    trust_remote_code=True,
    dtype="bfloat16",
    tp_size=2,  # tensor parallelism across 2 GPUs
)

news_context = "... relevant news articles ..."

prompt = f"""You are a forecasting expert. Given the question and context below, predict the probability that the answer is "Yes".

Question: Will Scottie Scheffler win the 2025 Masters?

Context:
{news_context}

Respond with your reasoning, then give your final answer as a probability between 0 and 1 inside <answer></answer> tags."""

output = engine.generate(prompt, sampling_params={"max_new_tokens": 4096, "stop": ["</answer>"]})
print(output["text"])
```
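Because generation stops at `</answer>`, the closing tag may be absent from the returned text. A small helper (hypothetical, not part of the SDK) to extract the probability from the output:

```python
import re

def extract_probability(text: str):
    """Pull the final probability from <answer>... in the model output.

    Accepts an unterminated tag, since the stop string can cut off
    the closing </answer>. Returns None if no answer is found.
    """
    match = re.search(r"<answer>\s*([01](?:\.\d+)?)", text)
    if match is None:
        return None
    return min(max(float(match.group(1)), 0.0), 1.0)  # clamp to [0, 1]
```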
## Links
- Dataset: LightningRodLabs/GolfForecasting
- Training platform: Tinker
- Data generation: Lightning Rod SDK
- Future-as-Label paper: arxiv:2601.06336
- Outcome-based RL paper: arxiv:2505.17989


