Commit 90c19f5 by Bturtel · verified · 1 parent: 25244b1

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +144 -0
README.md ADDED
@@ -0,0 +1,144 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - forecasting
+ - prediction
+ - reinforcement-learning
+ - grpo
+ - lora
+ - mixture-of-experts
+ - golf
+ - sports
+ - future-as-label
+ datasets:
+ - LightningRodLabs/GolfForecasting
+ base_model: openai/gpt-oss-120b
+ pipeline_tag: text-generation
+ model-index:
+ - name: Golf-Forecaster
+   results:
+   - task:
+       type: text-generation
+       name: Probabilistic Forecasting
+     dataset:
+       name: GolfForecasting
+       type: LightningRodLabs/GolfForecasting
+       split: test
+     metrics:
+     - type: brier_score
+       value: 0.207
+       name: Brier Score
+     - type: ece
+       value: 0.062
+       name: Expected Calibration Error
+ ---
+
+ # Golf-Forecaster
+
+ ### RL-Tuned gpt-oss-120b for Predicting Professional Golf Outcomes
+
+ We fine-tuned [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) with reinforcement learning to predict professional golf outcomes across the PGA Tour, LIV Golf, LPGA, DP World Tour, majors, and the Ryder Cup. Trained on the [GolfForecasting](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting) dataset of 3,178 binary forecasting questions generated with the [Lightning Rod SDK](https://github.com/lightning-rod-labs/lightningrod-python-sdk), Golf-Forecaster achieves a lower Brier score and lower calibration error than GPT-5.1 on held-out forecasting questions.
+
+ This repo contains a **LoRA adapter** (5.3 GB) for gpt-oss-120b. A standalone `merge.py` script is included to produce a full merged model if needed.
+
+ [Dataset](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting) · [Lightning Rod SDK](https://github.com/lightning-rod-labs/lightningrod-python-sdk) · [Future-as-Label paper](https://arxiv.org/abs/2601.06336) · [Outcome-based RL paper](https://arxiv.org/abs/2505.17989)
+
+ ---
+
+ ## Results
+
+ Evaluated on 855 held-out test questions (temporal split, Aug 2025+). Golf-Forecaster achieves the lowest Brier score, the highest Brier Skill Score, and the lowest ECE of the three models.
+
+ | Model | Brier Score | Brier Skill Score | ECE |
+ |-------|:---:|:---:|:---:|
+ | **Golf-Forecaster** | **0.207** | **+17.0%** | **0.062** |
+ | gpt-oss-120b (base) | 0.218 | +12.8% | 0.083 |
+ | GPT-5.1 | 0.218 | +12.8% | 0.106 |
+
+ ### Metrics
+
+ - **Brier Score**: Mean squared error between predicted probability and outcome (0 or 1). Lower is better. **Brier Skill Score (BSS)** expresses this as the relative improvement over always predicting the base rate; a positive value means the model learned something useful beyond historical frequency.
+ - **Expected Calibration Error (ECE)**: Measures whether predicted probabilities match actual frequencies. "70%" predictions should resolve "yes" 70% of the time. Lower is better.
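
As a rough illustration of the metric definitions above, the following sketch computes all three in plain Python. The 10 equal-width calibration bins, the use of the evaluated outcomes' own "yes" frequency as the base rate, and the toy data are assumptions made for illustration; the README does not state how the reported numbers were computed.

```python
from typing import Optional, Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def brier_skill_score(
    probs: Sequence[float],
    outcomes: Sequence[int],
    base_rate: Optional[float] = None,
) -> float:
    """Relative improvement over a constant base-rate forecast: 1 - BS / BS_ref."""
    if base_rate is None:
        # Assumption: use the empirical "yes" frequency of the evaluated outcomes.
        base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / reference


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between average forecast and observed frequency per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        frequency = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_prob - frequency)
    return ece


# Toy data for illustration only; not drawn from the GolfForecasting test set.
toy_probs = [0.9, 0.8, 0.3, 0.1]
toy_outcomes = [1, 1, 0, 0]
print(brier_score(toy_probs, toy_outcomes))
print(brier_skill_score(toy_probs, toy_outcomes))
print(expected_calibration_error(toy_probs, toy_outcomes))
```

With a 50% base rate the reference Brier score is 0.25, which is why a model scoring near 0.25 on a balanced question set has a skill score near zero.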
+
+ ---
+
+ ## Training
+
+ - **Base model**: [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (120B MoE, 5.1B active params, 128 experts, top-4 routing)
+ - **Method**: GRPO with Brier score reward via [Tinker](https://tinker.computer)
+ - **LoRA rank**: 32
+ - **Learning rate**: 4e-5
+ - **Batch size**: 32, group size 8
+ - **Training steps**: 100
+ - **Max tokens**: 16,384
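
To make the training signal more concrete, here is a minimal sketch of a Brier-based reward combined with the group-relative advantage that GRPO generally uses, shown for one group of 8 samples. The negative-squared-error reward shape, the standard-deviation normalization, the function names, and the sampled values are all assumptions; the actual training code run on Tinker may differ.

```python
import statistics
from typing import List


def brier_reward(prob: float, outcome: int) -> float:
    """Negative squared error, so better forecasts earn a higher reward."""
    return -((prob - outcome) ** 2)


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Center and scale rewards within one group of sampled completions."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps avoids division by zero when every sample earns the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical group of 8 sampled forecasts for one question that resolved "yes".
sampled_probs = [0.55, 0.70, 0.40, 0.85, 0.60, 0.30, 0.75, 0.65]
rewards = [brier_reward(p, outcome=1) for p in sampled_probs]
advantages = group_advantages(rewards)
for p, a in zip(sampled_probs, advantages):
    print(f"forecast={p:.2f} advantage={a:+.2f}")
```

Because advantages are computed relative to the group, a forecast is reinforced only when it scores better than the other samples for the same question, not merely when it is close to the outcome.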
+
+ ---
+
+ ## Usage
+
+ This repo contains a LoRA adapter trained with [Tinker](https://tinker.computer). The adapter uses Tinker's module naming convention, so it requires a merge step before inference. A standalone `merge.py` script is included.
+
+ ### Merge into full model
+
+ ```bash
+ pip install torch transformers safetensors tqdm huggingface-hub
+ python merge.py --output ./golf-forecaster-merged
+ ```
+
+ This downloads the base model, dequantizes to bf16, applies the LoRA adapter, and saves the merged model.
+
+ ### Inference with the merged model
+
+ With [SGLang](https://github.com/sgl-project/sglang) (recommended for MoE):
+
+ ```python
+ import sglang as sgl
+
+ engine = sgl.Engine(
+     model_path="./golf-forecaster-merged",
+     tokenizer_path="openai/gpt-oss-120b",
+     trust_remote_code=True,
+     dtype="bfloat16",
+     tp_size=2,
+ )
+
+ prompt = """You are a forecasting expert. Given the question and context below, predict the probability that the answer is "Yes".
+
+ Question: Will Scottie Scheffler win the 2025 Masters?
+
+ Respond with your reasoning, then give your final answer as a probability between 0 and 1 inside <answer></answer> tags."""
+
+ output = engine.generate(prompt, sampling_params={"max_new_tokens": 4096, "stop": ["</answer>"]})
+ print(output["text"])
+ ```
+
+ Or with transformers:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "./golf-forecaster-merged",
+     torch_dtype="auto",
+     device_map="auto",
+     trust_remote_code=True,
+ )
+ tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-120b", trust_remote_code=True)
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # prompt: same string as in the SGLang example above
+ outputs = model.generate(**inputs, max_new_tokens=4096, do_sample=True, temperature=0.7)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
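
Both examples return free-form text, so the probability still has to be pulled out of the `<answer>` tag. The following is a minimal sketch, assuming the model follows the format requested in the prompt; `extract_probability` is a hypothetical helper and is not part of this repo. Because the SGLang call stops on `</answer>`, the closing tag may be absent from the returned text, so the pattern accepts either the closing tag or the end of the string.

```python
import re
from typing import Optional

# A number right after <answer>, followed by </answer> or the end of the text.
_ANSWER_RE = re.compile(r"<answer>\s*([0-9]*\.?[0-9]+)\s*(?:</answer>|$)")


def extract_probability(text: str) -> Optional[float]:
    """Return the last well-formed probability in the text, or None."""
    matches = _ANSWER_RE.findall(text)
    if not matches:
        return None
    value = float(matches[-1])
    # Reject values outside [0, 1] rather than silently clipping them.
    return value if 0.0 <= value <= 1.0 else None


print(extract_probability("Scheffler is the favorite, but fields are deep. <answer>0.12"))
print(extract_probability("No tagged answer here."))
```

Taking the last match matters for the transformers example, where the decoded output also includes the prompt with its empty `<answer></answer>` tags; those contain no number and are skipped.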
+
+ ---
+
+ ## Links
+
+ - **Dataset**: [LightningRodLabs/GolfForecasting](https://huggingface.co/datasets/LightningRodLabs/GolfForecasting)
+ - **Training platform**: [Tinker](https://tinker.computer)
+ - **Data generation**: [Lightning Rod SDK](https://github.com/lightning-rod-labs/lightningrod-python-sdk)
+ - **Future-as-Label paper**: [arXiv:2601.06336](https://arxiv.org/abs/2601.06336)
+ - **Outcome-based RL paper**: [arXiv:2505.17989](https://arxiv.org/abs/2505.17989)