# NBA Sage - Technical Explanation

> **An AI-powered NBA game prediction system with real-time data, machine learning, and a modern web interface.**

---

## 🎯 What Does This Project Do?

NBA Sage is a full-stack application that:

1. **Predicts NBA game outcomes** before they happen
2. **Shows live scores** with real-time updates
3. **Tracks prediction accuracy** over time
4. **Calculates MVP race standings** based on current stats
5. **Estimates championship odds** for all 30 teams

---

## 🏆 Key Features

| Feature | Description |
|---------|-------------|
| **Live Game Dashboard** | Real-time scores, game status, win probabilities |
| **Win Predictions** | Probability % for each team to win |
| **Starting 5 Lineups** | Projected starters with PPG stats from NBA API |
| **MVP Race** | Top 10 MVP candidates with scores |
| **Championship Odds** | All 30 teams ranked by title probability |
| **Model Accuracy** | Track how well predictions perform over time |

---

## 🛠️ Technology Stack

### Backend (Python)
| Technology | Purpose |
|------------|---------|
| **Flask** | REST API framework |
| **nba_api** | Python client for NBA.com data (stats.nba.com) |
| **XGBoost + LightGBM** | Machine learning ensemble model |
| **APScheduler** | Background job scheduling |
| **ChromaDB Cloud** | Persistent prediction storage |
| **Pandas/NumPy** | Data processing |

### Frontend (React)
| Technology | Purpose |
|------------|---------|
| **React 18** | UI framework |
| **Vite** | Build tool & dev server |
| **Custom CSS** | Modern design system |

### Infrastructure
| Technology | Purpose |
|------------|---------|
| **Docker** | Container deployment |
| **Hugging Face Spaces** | Cloud hosting |
| **Git LFS** | Large file versioning |

---

## 🔬 How Predictions Work

### The Prediction Algorithm

Predictions are made using a **multi-factor formula**:

```
Win Probability = Log5 Formula of:
├── 40% - Current Season Record (Win %)
├── 30% - Recent Form (Last 10 games performance)
├── 20% - ELO Rating (Historical team strength)
└── 10% - Baseline

Adjustments Applied:
├── +3.5% for Home Court Advantage
└── -2% per Injury Impact Point
```
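
A minimal sketch of how the weighted factors above could feed the Log5 formula. How the ELO rating is rescaled to a 0-1 strength, the neutral `0.5` baseline, the final clamp, and all function names are assumptions made for illustration; the source only states the weights and the two adjustments.

```python
def elo_to_strength(elo: float, mean_elo: float = 1500.0) -> float:
    """Map an ELO rating to a 0-1 strength (expected score vs. an average team).

    Assumption: the source does not say how ELO is rescaled before blending.
    """
    return 1.0 / (1.0 + 10 ** ((mean_elo - elo) / 400.0))


def team_strength(win_pct: float, last10_pct: float, elo: float) -> float:
    """Blend the documented factors with the 40/30/20/10 weights."""
    baseline = 0.5  # assumed neutral value for the 10% baseline term
    return (0.40 * win_pct + 0.30 * last10_pct
            + 0.20 * elo_to_strength(elo) + 0.10 * baseline)


def log5(a: float, b: float) -> float:
    """Log5: probability that a team of strength a beats a team of strength b."""
    return (a - a * b) / (a + b - 2.0 * a * b)


def home_win_probability(home: dict, away: dict,
                         home_injury_points: float = 0.0) -> float:
    """Home team's win probability after the two documented adjustments."""
    p = log5(team_strength(**home), team_strength(**away))
    p += 0.035                       # home court advantage
    p -= 0.02 * home_injury_points   # injury penalty per impact point
    return min(max(p, 0.01), 0.99)   # assumed clamp to a valid probability


# Made-up inputs, for illustration only.
strong = {"win_pct": 0.70, "last10_pct": 0.80, "elo": 1650.0}
weak = {"win_pct": 0.30, "last10_pct": 0.20, "elo": 1400.0}
print(round(home_win_probability(strong, weak), 3))
```

With two identical teams, Log5 returns exactly 0.5, so the home side ends up at 53.5% once the home court adjustment is added.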

### ELO Rating System

ELO is a chess-inspired rating system adapted for the NBA:

- **Starting rating**: 1500 (average team)
- **K-factor**: 20 (how much ratings change per game)
- **Home advantage**: +100 ELO points equivalent
- **Season regression**: Ratings regress 25% to mean each season

**How it works:**
- Win against better team → Big ELO gain
- Win against weaker team → Small ELO gain
- Lose against better team → Small ELO loss
- Lose against weaker team → Big ELO loss
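
The behaviour listed above follows from the standard Elo update rule. The sketch below uses the parameters stated in this section (K-factor 20, 100-point home advantage, 25% regression toward 1500); the function names are hypothetical, and the project's own code may differ in details such as margin-of-victory scaling.

```python
def expected_home_win(home_elo: float, away_elo: float,
                      home_advantage: float = 100.0) -> float:
    """Expected score of the home team under the standard Elo logistic curve."""
    diff = away_elo - (home_elo + home_advantage)
    return 1.0 / (1.0 + 10 ** (diff / 400.0))


def update_elo(home_elo: float, away_elo: float, home_won: bool,
               k: float = 20.0) -> tuple:
    """Return (new_home_elo, new_away_elo) after one game.

    The winner gains exactly what the loser gives up, and the size of the
    change grows with how surprising the result was.
    """
    expected = expected_home_win(home_elo, away_elo)
    delta = k * ((1.0 if home_won else 0.0) - expected)
    return home_elo + delta, away_elo - delta


def regress_to_mean(elo: float, mean: float = 1500.0,
                    fraction: float = 0.25) -> float:
    """Pull a rating 25% of the way back toward the mean between seasons."""
    return elo + fraction * (mean - elo)


# An underdog home win moves ratings more than a favourite's home win.
upset_gain = update_elo(1400.0, 1650.0, home_won=True)[0] - 1400.0
routine_gain = update_elo(1650.0, 1400.0, home_won=True)[0] - 1650.0
print(upset_gain > routine_gain)
```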

---

## ๐Ÿ“Š Data Sources

### Real-Time Data
- **NBA Live API** (`nba_api.live`)
  - Live scores updated every 30 seconds
  - Game status (scheduled, in progress, final)
  - Box scores and player stats

### Historical Data
- **NBA Stats API** (`nba_api.stats`)
  - 23 years of game data (2003-2026)
  - Team statistics (basic, advanced, clutch, hustle)
  - Player statistics
  - Current season stats for predictions

### Data Storage
- **Parquet files**: Cached API responses (~140 files)
- **ChromaDB Cloud**: Prediction history and accuracy tracking
- **Joblib files**: Trained ML model and processed datasets

---

## 🧠 Machine Learning Components

### Trained Model: XGBoost + LightGBM Ensemble

Two gradient boosting models trained on 41,000+ historical games:

```
Game Features ──┬──► XGBoost (50%) ──┐
                │                    │──► Ensemble Prediction
                └──► LightGBM (50%) ─┘
```
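
A sketch of the 50/50 blend in the diagram, assuming each model has already produced a home-win probability (for example through the `predict_proba` method of the scikit-learn style wrappers that both libraries ship). The function names here are hypothetical.

```python
def ensemble_probability(xgb_prob: float, lgbm_prob: float,
                         xgb_weight: float = 0.5) -> float:
    """Weighted average of the two models' home-win probabilities."""
    if not 0.0 <= xgb_weight <= 1.0:
        raise ValueError("xgb_weight must be between 0 and 1")
    return xgb_weight * xgb_prob + (1.0 - xgb_weight) * lgbm_prob


def predict_winner(home_team: str, away_team: str,
                   xgb_prob: float, lgbm_prob: float) -> tuple:
    """Return (predicted winner, blended home-win probability)."""
    p_home = ensemble_probability(xgb_prob, lgbm_prob)
    winner = home_team if p_home >= 0.5 else away_team
    return winner, p_home


print(predict_winner("HOME", "AWAY", xgb_prob=0.62, lgbm_prob=0.58))
```

Averaging probabilities rather than hard picks lets one confident model outweigh a lukewarm one, which a simple majority vote cannot do.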

**Features Used:**
- ELO ratings and differentials
- Rolling averages (5, 10, 20 game windows)
- Rest days and back-to-back games
- Home/away status
- Season record statistics

### Training Pipeline

```
Data Collection ──► Feature Engineering ──► Model Training ──► Evaluation
         │                  │                    │
         ▼                  ▼                    ▼
    NBA API Data      ELO Calculation      XGBoost+LightGBM
                      Era Normalization
                      Rolling Windows
```

### Auto-Training System

The system automatically retrains itself:

1. **Ingests completed games** every hour
2. **Waits for all daily games** to complete
3. **Compares new model accuracy** to existing
4. **Only updates if improved** (prevents regression)
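
The gating logic in these steps can be sketched as follows. The status strings mirror the game states listed under Data Sources; `train_and_score` is a hypothetical callable that trains a candidate model and returns its validation accuracy.

```python
from typing import Callable, Sequence, Tuple


def all_games_final(statuses: Sequence[str]) -> bool:
    """True only when there is at least one game and every game is final."""
    return len(statuses) > 0 and all(s == "final" for s in statuses)


def retrain_if_ready(statuses: Sequence[str], current_accuracy: float,
                     train_and_score: Callable[[], float]) -> Tuple[str, float]:
    """Return (decision, accuracy of the model that stays in service)."""
    if not all_games_final(statuses):
        return "wait", current_accuracy       # do not train on a partial day
    candidate_accuracy = train_and_score()
    if candidate_accuracy > current_accuracy:
        return "promote", candidate_accuracy  # the new model replaces the old
    return "keep", current_accuracy           # guard against regression


print(retrain_if_ready(["final", "in progress"], 0.64, lambda: 0.70))
print(retrain_if_ready(["final", "final"], 0.64, lambda: 0.66))
```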

---

## 🌐 System Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                        React Frontend                            │
│  ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐         │
│  │ LiveGames │ │Predictions│ │ MVP Race  │ │ Accuracy  │         │
│  └─────▲─────┘ └─────▲─────┘ └─────▲─────┘ └─────▲─────┘         │
└────────┼─────────────┼─────────────┼─────────────┼───────────────┘
         │             │             │             │
         └─────────────┴──────┬──────┴─────────────┘
                              │ REST API
                              ▼
┌──────────────────────────────────────────────────────────────────┐
│                      Flask Server                                │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐      │
│  │   Endpoints    │  │    Caching     │  │  Scheduler     │      │
│  │  /api/live     │  │  In-Memory     │  │  APScheduler   │      │
│  │  /api/roster   │  │  1-hour rosters│  │  Auto-retrain  │      │
│  └────────┬───────┘  └────────────────┘  └────────────────┘      │
│           │                                                      │
│           ▼                                                      │
│  ┌────────────────────────────────────────────────────────┐      │
│  │              Prediction Pipeline                       │      │
│  │  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐    │      │
│  │  │Live Collector│ │ Feature Gen  │ │  ELO System  │    │      │
│  │  └──────────────┘ └──────────────┘ └──────────────┘    │      │
│  └────────────────────────────────────────────────────────┘      │
└──────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────────┐
│                      External Services                           │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐               │
│  │  NBA API    │  │ ChromaDB    │  │ Hugging Face│               │
│  │ stats.nba   │  │   Cloud     │  │   Spaces    │               │
│  └─────────────┘  └─────────────┘  └─────────────┘               │
└──────────────────────────────────────────────────────────────────┘
```

---

## 📁 Project Structure

```
NBA ML/
├── server.py              # Production server (Hugging Face)
├── api/api.py             # Development server
│
├── src/                   # Core logic
│   ├── prediction_pipeline.py   # Main orchestrator
│   ├── feature_engineering.py   # ELO + features
│   ├── data_collector.py        # Historical data
│   ├── live_data_collector.py   # Real-time data
│   ├── prediction_tracker.py    # Accuracy tracking
│   └── models/
│       └── game_predictor.py    # ML model
│
├── web/                   # React frontend
│   └── src/
│       ├── App.jsx
│       ├── pages/         # UI pages
│       └── index.css      # Design system
│
├── data/
│   └── api_data/          # 140+ parquet files
│
└── models/
    └── game_predictor.joblib  # Trained model (9.6KB)
```

---

## 🚀 Deployment

### Local Development
```bash
# Backend
python api/api.py  # Runs on localhost:8000

# Frontend
cd web && npm run dev  # Runs on localhost:5173
```

### Production (Hugging Face Spaces)
```bash
# Docker container
python server.py  # Serves both API + React on port 7860
```

---

## 📈 Performance & Accuracy

### Prediction Accuracy
- **Overall**: Tracked via ChromaDB Cloud
- **By Confidence**: High/Medium/Low confidence splits
- **By Team**: Per-team prediction accuracy
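
One way the confidence split could be computed from stored predictions. The bucket thresholds (70% and 60%) and the record layout are assumptions chosen for illustration; the source does not define what counts as high, medium, or low confidence.

```python
from collections import defaultdict


def confidence_bucket(home_win_prob: float) -> str:
    """Bucket a prediction by how far the favoured side is from a coin flip."""
    confidence = max(home_win_prob, 1.0 - home_win_prob)
    if confidence >= 0.70:   # assumed threshold
        return "high"
    if confidence >= 0.60:   # assumed threshold
        return "medium"
    return "low"


def accuracy_by_confidence(records) -> dict:
    """records: iterable of (home_win_prob, home_won) pairs."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for prob, home_won in records:
        bucket = confidence_bucket(prob)
        totals[bucket] += 1
        hits[bucket] += int((prob >= 0.5) == home_won)  # was the pick right?
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}


# Made-up history: (predicted home-win probability, did the home team win?)
history = [(0.80, True), (0.25, True), (0.65, True), (0.55, False)]
print(accuracy_by_confidence(history))
```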

### Speed Optimizations
- **In-memory caching**: Roster data cached for 1 hour
- **Startup warming**: All 30 teams pre-loaded on server start
- **Background refresh**: Cache updated every 2 hours
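
A minimal time-to-live cache in the spirit of the roster caching described above. The class, its method names, and the loader are hypothetical; the 1-hour lifetime comes from this section.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def warm(self, keys, loader):
        """Pre-load every key, for example all 30 teams at server start."""
        for key in keys:
            self.get(key, loader)


def load_roster(team: str) -> list:
    """Stand-in for a slow API call; returns placeholder data."""
    return [f"{team} player {i}" for i in range(1, 6)]


rosters = TTLCache(ttl_seconds=3600)
rosters.warm(["BOS", "LAL"], load_roster)
print(rosters.get("BOS", load_roster)[0])
```

Warming the cache at startup moves the slow calls out of the first user request, and a background job that calls `warm` again before entries expire keeps later requests fast as well.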

---

## 🔮 Future Improvements

1. **Integrate ML model** into live predictions (currently formula-based)
2. **Add player-level features** (injuries, rest days per player)
3. **Implement spread predictions** (margin of victory)
4. **Add playoff predictions** with series outcomes

---

## 📊 Stats at a Glance

| Metric | Value |
|--------|-------|
| Historical games | 41,000+ |
| Seasons covered | 23 (2003-2026) |
| Teams tracked | 30 |
| ML model type | XGBoost + LightGBM |
| API endpoints | 10+ |
| Frontend pages | 6 |

---

## 📋 Complete ML Feature List (90+ Features)

The model uses approximately **90 features** organized into these categories:

### 1️⃣ ELO Rating Features (5 features)
| Feature | Description |
|---------|-------------|
| `team_elo` | Team's current ELO rating |
| `opponent_elo` | Opponent's current ELO rating |
| `elo_diff` | Difference between team and opponent ELO |
| `elo_win_prob` | Expected win probability from ELO |
| `home_elo_boost` | ELO boost for home court (100 points) |

### 2️⃣ Basic Stats - Rolling Averages (21 features)
For each of 7 stats × 3 windows (5, 10, and 20 games):

| Base Stat | Windows |
|-----------|---------|
| `PTS` (Points) | `PTS_last5`, `PTS_last10`, `PTS_last20` |
| `AST` (Assists) | `AST_last5`, `AST_last10`, `AST_last20` |
| `REB` (Rebounds) | `REB_last5`, `REB_last10`, `REB_last20` |
| `FG_PCT` (Field Goal %) | `FG_PCT_last5`, `FG_PCT_last10`, `FG_PCT_last20` |
| `FG3_PCT` (3-Point %) | `FG3_PCT_last5`, `FG3_PCT_last10`, `FG3_PCT_last20` |
| `FT_PCT` (Free Throw %) | `FT_PCT_last5`, `FT_PCT_last10`, `FT_PCT_last20` |
| `PLUS_MINUS` (Point Diff) | `PLUS_MINUS_last5`, `PLUS_MINUS_last10`, `PLUS_MINUS_last20` |
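
A sketch of how one of these rolling columns might be built. It assumes that each game's feature uses only games played before it, so the value being predicted never leaks into its own input; the source does not state this, and the helper name is hypothetical.

```python
def rolling_means(values, window):
    """Mean of up to `window` values strictly before each game.

    Returns None for the first game, which has no history yet.
    """
    out = []
    for i in range(len(values)):
        prior = values[max(0, i - window):i]   # excludes the current game
        out.append(sum(prior) / len(prior) if prior else None)
    return out


points = [100, 110, 120, 130]   # made-up points per game, oldest first
features = {f"PTS_last{w}": rolling_means(points, w) for w in (5, 10, 20)}
print(features["PTS_last5"])
```

Early in a season the 5, 10, and 20 game windows all see the same short history, so the three columns only diverge once enough games have been played.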

### 3️⃣ Season Statistics (9 features)
| Feature | Description |
|---------|-------------|
| `PTS_season_avg` | Season average points |
| `AST_season_avg` | Season average assists |
| `REB_season_avg` | Season average rebounds |
| `FG_PCT_season_avg` | Season field goal % |
| `FG3_PCT_season_avg` | Season 3-point % |
| `FT_PCT_season_avg` | Season free throw % |
| `PLUS_MINUS_season_avg` | Season point differential |
| `win_pct_season` | Season win percentage |
| `games_played` | Games played in season |

### 4️⃣ Defensive Features (4 features)
| Feature | Description |
|---------|-------------|
| `STL_last10` | Steals per game (last 10) |
| `BLK_last10` | Blocks per game (last 10) |
| `DREB_last10` | Defensive rebounds (last 10) |
| `pts_allowed_last10` | Points allowed (last 10) |

### 5️⃣ Momentum Features (6 features)
| Feature | Description |
|---------|-------------|
| `wins_last5` | Wins in last 5 games (0-5) |
| `wins_last10` | Wins in last 10 games (0-10) |
| `hot_streak` | 1 if 4+ wins in last 5 |
| `cold_streak` | 1 if 1 or fewer wins in last 5 |
| `plus_minus_last5` | Point differential trend |
| `form_trend` | Comparison of last 3 vs previous 3 |
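
The win-based momentum flags can be derived from a plain list of results, as in this sketch. Treating `form_trend` as the difference in wins between the last 3 games and the 3 before them is an assumption, as is requiring a full 5-game history before a streak flag can be set.

```python
def momentum_features(results):
    """results: 1 for a win, 0 for a loss, oldest first, prior games only."""
    last5 = results[-5:]
    wins5 = sum(last5)
    full5 = len(last5) == 5                   # assumed: no streak flag before 5 games
    recent3, previous3 = results[-3:], results[-6:-3]
    return {
        "wins_last5": wins5,
        "wins_last10": sum(results[-10:]),
        "hot_streak": int(full5 and wins5 >= 4),
        "cold_streak": int(full5 and wins5 <= 1),
        # assumed definition: wins in last 3 minus wins in the 3 before that
        "form_trend": sum(recent3) - sum(previous3) if len(results) >= 6 else 0,
    }


print(momentum_features([0, 0, 1, 0, 1, 1, 1, 1, 0, 1]))
```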

### 6️⃣ Rest & Fatigue Features (4 features)
| Feature | Description |
|---------|-------------|
| `days_rest` | Days since last game |
| `back_to_back` | 1 if playing consecutive days |
| `well_rested` | 1 if 3+ days rest |
| `games_last_week` | Games played in last 7 days |
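
These four features depend only on game dates. The sketch below is one possible derivation; the defaults used for a team with no earlier games are assumptions, and the function name is hypothetical.

```python
from datetime import date, timedelta


def rest_features(previous_games, game_day):
    """previous_games: dates of earlier games; game_day: date of the upcoming game."""
    prior = sorted(d for d in previous_games if d < game_day)
    if not prior:
        # Assumed defaults for a season opener: no fatigue at all.
        return {"days_rest": None, "back_to_back": 0,
                "well_rested": 1, "games_last_week": 0}
    days_rest = (game_day - prior[-1]).days
    week_start = game_day - timedelta(days=7)
    return {
        "days_rest": days_rest,
        "back_to_back": int(days_rest == 1),    # played yesterday
        "well_rested": int(days_rest >= 3),
        "games_last_week": sum(1 for d in prior if d >= week_start),
    }


schedule = [date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 5), date(2024, 1, 6)]
print(rest_features(schedule, date(2024, 1, 7)))
```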

### 7️⃣ Form Index Features (3 features)
| Feature | Description |
|---------|-------------|
| `form_index` | Exponentially-weighted recent performance (0-1) |
| `form_trend` | Trend direction (improving/declining) |
| `form_plus_minus` | Weighted point differential |
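
An exponentially weighted form index could look like the sketch below, which weights each result by `decay` raised to its age in games. The decay value of 0.8 and the neutral 0.5 returned for a team with no games are assumptions; the source only says the index is exponentially weighted and lies between 0 and 1.

```python
def form_index(results, decay=0.8):
    """Weighted share of wins, with recent games counting more.

    results: 1 for a win, 0 for a loss, oldest first.
    """
    if not results:
        return 0.5   # assumed neutral value when there is no history
    n = len(results)
    # The most recent game gets weight 1, the one before it `decay`, and so on.
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_wins = sum(w * r for w, r in zip(weights, results))
    return weighted_wins / sum(weights)


# Same record, different order: the team that won most recently rates higher.
print(form_index([0, 0, 1, 1]) > form_index([1, 1, 0, 0]))
```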

### 8️⃣ Basic Stat Columns (17 raw features)
```python
BASIC_STATS = [
    "PTS", "AST", "REB", "STL", "BLK", "TOV",
    "FGM", "FGA", "FG_PCT",
    "FG3M", "FG3A", "FG3_PCT",
    "FTM", "FTA", "FT_PCT",
    "OREB", "DREB"
]
```

### 9️⃣ Advanced Team Stats (11 features)
```python
ADVANCED_STATS = [
    "E_OFF_RATING",    # Offensive Rating
    "E_DEF_RATING",    # Defensive Rating
    "E_NET_RATING",    # Net Rating
    "E_PACE",          # Pace (possessions per game)
    "E_AST_RATIO",     # Assist Ratio
    "E_OREB_PCT",      # Offensive Rebound %
    "E_DREB_PCT",      # Defensive Rebound %
    "E_REB_PCT",       # Total Rebound %
    "E_TM_TOV_PCT",    # Team Turnover %
    "E_EFG_PCT",       # Effective FG%
    "E_TS_PCT"         # True Shooting %
]
```

### 🔟 Clutch Stats (4 features)
```python
CLUTCH_STATS = [
    "CLUTCH_PTS",          # Points in clutch time
    "CLUTCH_FG_PCT",       # FG% in clutch
    "CLUTCH_FG3_PCT",      # 3PT% in clutch
    "CLUTCH_PLUS_MINUS"    # +/- in clutch
]
```

### 1️⃣1️⃣ Hustle Stats (5 features)
```python
HUSTLE_STATS = [
    "DEFLECTIONS",             # Passes deflected
    "LOOSE_BALLS_RECOVERED",   # Loose balls recovered
    "CHARGES_DRAWN",           # Offensive fouls drawn
    "CONTESTED_SHOTS",         # Shots contested
    "SCREEN_ASSISTS"           # Screen assists
]
```

### 1️⃣2️⃣ Top Player Stats (6 features)
| Feature | Description |
|---------|-------------|
| `top_players_avg_pts` | Avg points of top 5 players |
| `top_players_avg_ast` | Avg assists of top 5 players |
| `top_players_avg_reb` | Avg rebounds of top 5 players |
| `top_players_avg_stl` | Avg steals of top 5 players |
| `top_players_avg_blk` | Avg blocks of top 5 players |
| `star_concentration` | % of scoring from top player |
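
A sketch of how these roster-level features might be aggregated from per-player averages. Ranking the "top 5" by points per game and expressing `star_concentration` as a fraction between 0 and 1 are assumptions; the input layout is hypothetical.

```python
def top_player_features(players, top_n=5):
    """players: list of dicts with per-game 'pts', 'ast', 'reb', 'stl', 'blk'."""
    if not players:
        raise ValueError("at least one player is required")
    ranked = sorted(players, key=lambda p: p["pts"], reverse=True)
    top = ranked[:top_n]   # assumed: "top" means highest scorers
    features = {
        f"top_players_avg_{stat}": sum(p[stat] for p in top) / len(top)
        for stat in ("pts", "ast", "reb", "stl", "blk")
    }
    team_pts = sum(p["pts"] for p in players)
    # Share of team scoring carried by the leading scorer, as a fraction.
    features["star_concentration"] = ranked[0]["pts"] / team_pts if team_pts else 0.0
    return features


# Made-up roster of six players.
roster = [
    {"pts": pts, "ast": 4.0, "reb": 5.0, "stl": 1.0, "blk": 0.5}
    for pts in (30.0, 20.0, 10.0, 10.0, 10.0, 20.0)
]
print(top_player_features(roster)["star_concentration"])
```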

### 1️⃣3️⃣ Game Context (1 feature)
| Feature | Description |
|---------|-------------|
| `is_home` | 1 if home team, 0 if away |

---

## 📊 Feature Summary

| Category | Feature Count |
|----------|---------------|
| ELO Ratings | 5 |
| Rolling Averages (5/10/20) | 21 |
| Season Statistics | 9 |
| Defensive Stats | 4 |
| Momentum Features | 6 |
| Rest/Fatigue | 4 |
| Form Index | 3 |
| Advanced Team Stats | 11 |
| Clutch Stats | 4 |
| Hustle Stats | 5 |
| Top Player Stats | 6 |
| Game Context | 1 |
| **TOTAL** | **~79 core features** |

*Z-score normalized versions of these stats, added for era adjustment, bring the total to **90+ features**.*
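
Era adjustment with Z-scores can be sketched as below: each stat is standardized within its own season, so a 110-point game in a low-scoring year and a 130-point game in a high-scoring year can land on the same scale. The `_z` suffix, the row layout, and the use of the population standard deviation are assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_season(rows, stat):
    """Add a '<stat>_z' value to each row, standardized within its season."""
    by_season = defaultdict(list)
    for row in rows:
        by_season[row["season"]].append(row[stat])
    season_stats = {s: (mean(v), pstdev(v)) for s, v in by_season.items()}

    out = []
    for row in rows:
        mu, sigma = season_stats[row["season"]]
        z = (row[stat] - mu) / sigma if sigma > 0 else 0.0  # flat season -> 0
        out.append({**row, f"{stat}_z": z})
    return out


# Made-up scores: the later season is 20 points higher across the board.
games = [{"season": 2004, "PTS": p} for p in (90, 100, 110)]
games += [{"season": 2024, "PTS": p} for p in (110, 120, 130)]
for row in zscore_by_season(games, "PTS"):
    print(row["season"], row["PTS"], round(row["PTS_z"], 3))
```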

---

*Built with Python, React, and a passion for basketball analytics* 🏀