---
language:
- en
license: apache-2.0
library_name: peft
tags:
- forecasting
- prediction
- reinforcement-learning
- grpo
- lora
- mixture-of-experts
- golf
- sports
- future-as-label
datasets:
- LightningRodLabs/GolfForecasting
base_model: openai/gpt-oss-120b
pipeline_tag: text-generation
model-index:
- name: Golf-Forecaster
  results:
  - task:
      type: text-generation
      name: Probabilistic Forecasting
    dataset:
      name: GolfForecasting
      type: LightningRodLabs/GolfForecasting
      split: test
    metrics:
    - type: brier_score
      value: 0.207
      name: Brier Score
    - type: ece
      value: 0.062
      name: Expected Calibration Error
---

# Golf-Forecaster

### RL-Tuned gpt-oss-120b for Predicting Professional Golf Outcomes

Starting from nothing but 9 search queries, we used the [Lightning Rod SDK](https://github.com/lightning-rod-labs/lightningrod-python-sdk) to automatically generate [3,178 forecasting questions](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting) from news articles, label them using real outcomes, and train this model via RL. **No expertise required. No manual labeling. No domain-specific engineering.** The result beats GPT-5 on held-out questions.

You can do this in any domain: just change the search queries. See [how we built the dataset](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting).

This repo contains a **LoRA adapter** for [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b). A standalone `merge.py` script is included to merge it into a full model.

---

## Results

Evaluated on 855 held-out test questions (temporal split, Aug 2025+).

| Model | Brier Score | Brier Skill Score | ECE |
|-------|:---:|:---:|:---:|
| **Golf-Forecaster** | **0.207** | **+17.0%** | **0.062** |
| gpt-oss-120b (base) | 0.218 | +12.8% | 0.083 |
| GPT-5 | 0.218 | +12.8% | 0.106 |

![Brier Skill Score](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting/resolve/main/brier_skill_score.png)

![Brier Score Comparison](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting/resolve/main/brier_score_comparison.png)

![ECE Comparison](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting/resolve/main/ece_comparison.png)

**Brier Score**: Mean squared error between predicted probability and outcome. Lower is better. **BSS** measures improvement over always predicting the base rate.

**ECE**: Whether predicted probabilities match actual frequencies. Lower is better.

---

## Training

- **Base model**: [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (120B MoE, 5.1B active params)
- **Method**: GRPO with Brier score reward via [Tinker](https://tinker.computer)
- **LoRA rank**: 32, learning rate 4e-5, batch size 32, group size 8, 100 steps

---

## Usage

The adapter uses Tinker's module naming convention, so it requires a merge step before inference. A standalone `merge.py` script is included.

### Merge into full model

```bash
pip install torch transformers safetensors tqdm huggingface-hub
python merge.py --output ./golf-forecaster-merged
```

### Inference

```python
import sglang as sgl

# Load the merged model; the tokenizer comes from the base model repo.
engine = sgl.Engine(
    model_path="./golf-forecaster-merged",
    tokenizer_path="openai/gpt-oss-120b",
    trust_remote_code=True,
    dtype="bfloat16",
    tp_size=2,
)

news_context = "... relevant news articles ..."

prompt = f"""You are a forecasting expert. Given the question and context below, predict the probability that the answer is "Yes".

Question: Will Scottie Scheffler win the 2025 Masters?

Context:
{news_context}

Respond with your reasoning, then give your final answer as a probability between 0 and 1 inside tags."""

output = engine.generate(prompt, sampling_params={"max_new_tokens": 4096, "stop": [""]})
print(output["text"])
```

---

## Links

- **Dataset**: [LightningRodLabs/GolfForecasting](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting)
- **Training platform**: [Tinker](https://tinker.computer)
- **Data generation**: [Lightning Rod SDK](https://github.com/lightning-rod-labs/lightningrod-python-sdk)
- **Future-as-Label paper**: [arxiv:2601.06336](https://arxiv.org/abs/2601.06336)
- **Outcome-based RL paper**: [arxiv:2505.17989](https://arxiv.org/abs/2505.17989)
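
## Computing the metrics

For readers who want to score their own predictions, the three metrics defined under Results can be sketched in plain Python. This is an illustration, not the evaluation code used for this model: the function names are hypothetical, the 10 equal-width bins for ECE are an assumption, and the BSS reference here uses the base rate of the scored set itself, which may differ from the reference used for the reported numbers.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Relative improvement over always predicting the base rate.

    Assumption: the base rate is taken from the same outcomes being scored.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / reference


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width binned ECE (the bin count is an assumption)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        freq = sum(y for _, y in members) / len(members)
        # Weight each bin's calibration gap by its share of predictions.
        ece += len(members) / total * abs(mean_p - freq)
    return ece


if __name__ == "__main__":
    # Toy predictions and resolved outcomes, for illustration only.
    probs = [0.9, 0.2, 0.7, 0.4]
    outcomes = [1, 0, 1, 0]
    print("Brier:", brier_score(probs, outcomes))
    print("BSS:  ", brier_skill_score(probs, outcomes))
    print("ECE:  ", expected_calibration_error(probs, outcomes))
```

A BSS above zero means the forecasts beat the constant base-rate prediction. Note that `brier_skill_score` divides by zero if every outcome in the scored set is identical, so a real evaluation would need to guard against that case.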