datamatters24 committed · Commit 5d63e3a · verified · 1 Parent(s): 827287e

Create README.md

---
license: mit
tags:
- xgboost
- lightgbm
- sports-prediction
- formula1
- tabular
- classification
- ensemble
- optuna
language:
- en
---

# Telemetry Chaos — F1 Race Prediction Model

An XGBoost + LightGBM ensemble that predicts Formula 1 race winners from 76 seasons of historical data (1950–2025). Hyperparameters are tuned with Optuna over 200 trials, and the model retrains weekly during the active season.

**Live demo:** [telemetrychaos.space](https://telemetrychaos.space)

## Performance

| Metric | Score |
|---|---|
| Top-1 Accuracy | 53% |
| Top-3 Accuracy | 85% |
| Top-5 Accuracy | 96% |

Evaluated on the 2024 and 2025 seasons using a time-series split to prevent data leakage.

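The README does not spell out how Top-k accuracy is computed. A common reading, assumed here, is the fraction of races in which the actual winner is among the k drivers with the highest predicted win probability. The sketch below illustrates that reading; the `(race_id, driver, win_prob, is_winner)` row layout and the toy numbers are invented for illustration and are not the project's evaluation code.

```python
from collections import defaultdict


def top_k_accuracy(rows, k):
    """Fraction of races whose actual winner is among the k most likely drivers.

    rows: iterable of (race_id, driver, win_prob, is_winner) tuples
    (a hypothetical layout chosen for this example).
    """
    races = defaultdict(list)
    for race_id, driver, prob, is_winner in rows:
        races[race_id].append((prob, driver, is_winner))

    hits = 0
    for entries in races.values():
        # Rank drivers by predicted probability, highest first, and keep k.
        top = sorted(entries, key=lambda e: e[0], reverse=True)[:k]
        hits += any(is_winner for _, _, is_winner in top)
    return hits / len(races)


# Toy predictions, invented for illustration only.
rows = [
    ("race_1", "A", 0.6, True), ("race_1", "B", 0.3, False), ("race_1", "C", 0.1, False),
    ("race_2", "A", 0.5, False), ("race_2", "B", 0.2, True), ("race_2", "C", 0.3, False),
]
print(top_k_accuracy(rows, 1))  # 0.5
print(top_k_accuracy(rows, 3))  # 1.0
```
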
## Features (21 per driver per race)

- **Form:** Rolling average points, recent podiums, win streak
- **Pace:** Practice session lap time delta vs. teammate and field
- **Constructor:** Team rolling performance, reliability score
- **Track history:** Driver-specific circuit win rate, podium rate
- **Tyre:** Degradation profile, pit stop speed, strategy tendency
- **Conditions:** Weather forecast, safety car probability
- **Grid:** Starting position, qualifying gap to pole

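As an illustration of the "Form" group, here is a minimal sketch of a rolling average points feature that uses only races before the one being predicted, so the target race never leaks into its own feature. The window size, the `(driver, points)` input layout, and the function name are assumptions; the README does not state how the feature is actually built.

```python
from collections import defaultdict, deque


def rolling_avg_points(results, window=5):
    """Mean points over each driver's previous `window` races.

    results: list of (driver, points) tuples in chronological order.
    Returns one value per input row; None when the driver has no history yet.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    features = []
    for driver, points in results:
        past = history[driver]
        features.append(sum(past) / len(past) if past else None)
        # Record the result only after the feature is computed,
        # so the current race is never part of its own feature.
        past.append(points)
    return features


# Toy results, invented for illustration only.
results = [("A", 25), ("B", 18), ("A", 18), ("B", 25), ("A", 15)]
print(rolling_avg_points(results, window=2))  # [None, None, 25.0, 18.0, 21.5]
```
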
## Architecture

```
XGBoost (GPU) + LightGBM
Optuna HPO: 200 trials, TPE sampler
Time-series split: train on seasons N-5 to N-1, evaluate on N
Final output: softmax win probabilities per driver
```

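The time-series split described above (train on seasons N-5 to N-1, evaluate on N) can be sketched as a walk-forward generator. This is a minimal illustration under the assumption that seasons are consecutive years; `season_splits` and `train_window` are hypothetical names, not part of the published code.

```python
def season_splits(seasons, train_window=5):
    """Yield (train_seasons, eval_season) pairs in chronological order.

    Each evaluation season N is paired with the `train_window` seasons
    immediately before it, so no future data reaches the training set.
    """
    ordered = sorted(set(seasons))
    for i in range(train_window, len(ordered)):
        yield ordered[i - train_window:i], ordered[i]


for train, evaluation in season_splits(range(2019, 2026)):
    print(train, "->", evaluation)
```

With seasons 2019 through 2025 this produces two folds, evaluated on 2024 and 2025, which matches the evaluation seasons named in the Performance section.
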
## Dataset

- **Coverage:** 1950–2025, 76 seasons
- **Records:** 1,322,914 race records
- **Telemetry laps:** 470K+
- **Sources:** FastF1, Jolpica-F1, f1db, Kaggle

## Usage

```python
import joblib

# Load the serialized ensemble (file name as given in this README).
model = joblib.load("f1_ensemble.joblib")

# Input: X, an array-like with one 21-feature row per driver,
# built by the caller in the order the model was trained on.
# Output: win probability (0-1)
probs = model.predict_proba(X)
```

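The Architecture section lists softmax win probabilities per driver as the final output. How the ensemble's raw scores feed that step is not documented, so the sketch below simply assumes one raw score per driver for a single race and normalizes them so the grid sums to 1. The driver labels and scores are placeholders.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for one race grid (not real model output).
grid_scores = {"A": 2.0, "B": 1.0, "C": 0.0}

win_probs = dict(zip(grid_scores, softmax(list(grid_scores.values()))))
ranking = sorted(win_probs, key=win_probs.get, reverse=True)

print(ranking)  # ['A', 'B', 'C']
print(round(sum(win_probs.values()), 6))  # 1.0
```

Normalizing within a race rather than across the whole dataset keeps the probabilities comparable between drivers on the same grid.
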
## Auto-Update Pipeline

During the active F1 season, the model retrains weekly:

1. Pull latest race results and telemetry via FastF1
2. Engineer features for upcoming race grid
3. Retrain ensemble with updated data
4. Publish updated predictions to telemetrychaos.space

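The four steps above could be orchestrated roughly as follows. This is only a sketch: the step names, the no-op placeholder callables, and the season boundary dates are all hypothetical, and the real pipeline's scheduling and error handling are not described in this README.

```python
from datetime import date


def run_weekly_update(today, season_start, season_end, steps):
    """Run each (name, callable) step in order when `today` is in season.

    Returns the names of the steps that ran; an empty list means off-season.
    """
    if not (season_start <= today <= season_end):
        return []
    completed = []
    for name, step in steps:
        step()  # an exception here stops the remaining steps
        completed.append(name)
    return completed


# Placeholder steps mirroring the four stages listed above.
steps = [
    ("pull_results", lambda: None),
    ("engineer_features", lambda: None),
    ("retrain_ensemble", lambda: None),
    ("publish_predictions", lambda: None),
]

# Season boundaries are made-up placeholders, not real calendar dates.
season_start, season_end = date(2025, 3, 1), date(2025, 12, 31)
ran = run_weekly_update(date(2025, 6, 1), season_start, season_end, steps)
print(ran)
```

Running the steps in a fixed order and stopping at the first failure means stale predictions are never published after a failed retrain.
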
## Citation

```bibtex
@misc{rubin2026telemetrychaos,
  author = {Rubin, Theodore},
  title = {Telemetry Chaos: F1 Race Prediction with XGBoost/LightGBM Ensemble},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datamatters24/f1-race-predictor-model}
}
```