Syntrex's picture
Update SQL
3b144a1 verified
Here is Claude's plan: β•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œ Fix: Multi-book display + Full Pre-Game Model Stack for HR%
Context
Props are now rendering (Caesars only, model HR% blank). Three issues to fix:
Issue A β€” Only one sportsbook showing (Caesars)
Stop-at-first provider logic returns Caesars (from Odds API) and never runs the scraper for
DK/FD/BetMGM. Fix: concat-all providers, dedup by best odds per player+book+market.
Issue B β€” Model HR% blank (pre-season statcast empty)
load_statcast_recent() returns empty pre-season (no 2026 games). Name index is empty β†’ all
players return "unavailable". Fix: fall back to load_statcast_previous_season_full() (2025).
Issue C β€” Model uses only batter baseline
analytics/props_mapper.py currently applies only compute_batter_baseline(). Full pre-game
model includes: park factor, pitcher quality, zone matchup, arsenal matchup, rolling form.
Key data gap: (1) statcast_df is batter-perspective (player_type=batter) β€” no pitcher rows.
(2) Props rows have no pitcher name or venue. Both must be resolved.
---
Files to Modify / Create
1. data/live_prop_odds.py β€” concat-all with dedup (Fix A)
2. app.py β€” statcast fallback + pitcher statcast loader + Props call site changes (Fixes B + C)
3. data/statcast.py β€” add player_type param to _query_statcast()
4. data/mlb_starters.py (NEW) β€” probable starters from MLB Stats API
5. analytics/props_mapper.py β€” full pre-game model stack (Fix C)
6. visualization/props_page.py β€” pass pitcher_statcast_df + probable_starters through
---
Fix A: data/live_prop_odds.py β€” concat-all with dedup
Replace stop-at-first loop. Run ALL providers, concatenate, dedup keeping best odds.
# CURRENT loop (stop-at-first):
for provider in providers:
...
if not df.empty:
return normalize_prop_odds(df) # exits on first non-empty
return pd.DataFrame()
# NEW loop (concat-all, then dedup):
frames = []
for provider in providers:
try:
fetch_fn = getattr(provider, "fetch_all_upcoming_hr_props", None)
if fetch_fn is None:
continue
df = fetch_fn(sportsbooks=sportsbooks)
if not df.empty:
frames.append(df)
except Exception as e:
logger.warning(f"[odds_provider_fetch] failure: {e}", exc_info=True)
continue
if not frames:
return pd.DataFrame()
merged = pd.concat(frames, ignore_index=True)
merged = normalize_prop_odds(merged)
# Dedup: keep one row per (player_name, sportsbook_key, market) β€” best odds wins
if not merged.empty and "sportsbook_key" in merged.columns:
merged["_odds_score"] = merged["odds_american"].apply(
lambda x: int(x) if pd.notna(x) else -9999
)
merged = (
merged
.sort_values("_odds_score", ascending=False)
.drop_duplicates(subset=["player_name", "sportsbook_key", "market"], keep="first")
.drop(columns=["_odds_score"])
.reset_index(drop=True)
)
logger.warning(
"[fetch_all_upcoming_hr_props] providers=%d frames=%d merged_rows=%d unique_books=%s",
len(providers), len(frames), len(merged),
sorted(merged["sportsbook"].dropna().unique().tolist()) if not merged.empty else [],
)
return merged
---
Fix B: app.py β€” statcast fallback for pre-season (line 3409 area)
# CURRENT:
render_props(load_statcast_recent(), conn=conn, raw_props=load_upcoming_hr_props())
# NEW:
_statcast_for_props = load_statcast_recent()
if _statcast_for_props.empty:
_statcast_for_props = load_statcast_previous_season_full()
---
Fix C part 1: data/statcast.py β€” add player_type param
_query_statcast() currently hardcodes "player_type": "batter". Add a parameter so pitcher
perspective can be fetched separately.
def _query_statcast(start_date: str, end_date: str, season: str, player_type: str = "batter") -> pd.DataFrame:
params = {
...
"player_type": player_type, # was hardcoded "batter"
...
}
fetch_statcast_range() keeps its existing signature unchanged (defaults to batter).
Add new public function:
def fetch_statcast_range_pitcher(start_date: str, end_date: str) -> pd.DataFrame:
"""Fetch pitcher-perspective Statcast (player_name = pitcher name)."""
season = str(datetime.strptime(start_date, "%Y-%m-%d").year)
return _query_statcast(start_date, end_date, season=season, player_type="pitcher")
---
Fix C part 2: app.py β€” load pitcher statcast + probable starters
Add alongside load_statcast_previous_season_full():
@st.cache_data(ttl=60 * 60 * 12, show_spinner=False)
def load_statcast_previous_season_full_pitcher() -> pd.DataFrame:
"""2025 season, pitcher perspective β€” player_name = pitcher name."""
today = pd.Timestamp.utcnow().date()
previous_year = today.year - 1
start_date = pd.Timestamp(year=previous_year, month=1, day=1).date()
end_date = pd.Timestamp(year=previous_year, month=12, day=31).date()
from data.statcast import fetch_statcast_range_pitcher
from data.statcast import normalize_statcast
from features.pitch_features import add_pitch_features
raw = fetch_statcast_range_pitcher(start_date.isoformat(), end_date.isoformat())
normalized = normalize_statcast(raw)
return add_pitch_features(normalized)
Update the Props call site:
_statcast_for_props = load_statcast_recent()
if _statcast_for_props.empty:
_statcast_for_props = load_statcast_previous_season_full()
_pitcher_statcast = load_statcast_previous_season_full_pitcher()
from data.mlb_starters import fetch_probable_starters_for_props
_probable_starters = fetch_probable_starters_for_props() # {(away_team, home_team): {...}}
render_props(
_statcast_for_props,
conn=conn,
raw_props=load_upcoming_hr_props(),
pitcher_statcast_df=_pitcher_statcast,
probable_starters=_probable_starters,
)
---
Fix C part 3: data/mlb_starters.py (NEW FILE)
Fetches probable starters from MLB Stats API for the next 7 days. Returns a dict mapping
(away_team, home_team) -> {"home_pitcher": str, "away_pitcher": str} for use in props_mapper.
Uses the public statsapi.mlb.com/api/v1/schedule endpoint with hydrate=probablePitcher.
Cached in the calling layer (Streamlit cache_data). No API key required.
MLB API team IDs for team name matching β€” response includes teams.home.team.name and
teams.away.team.name alongside teams.home.probablePitcher.fullName.
# data/mlb_starters.py
from __future__ import annotations
import logging
from datetime import timedelta
import requests
import pandas as pd
_log = logging.getLogger(__name__)
_SCHEDULE_URL = "https://statsapi.mlb.com/api/v1/schedule"
def fetch_probable_starters_for_props() -> dict[tuple[str, str], dict[str, str | None]]:
"""
Returns {(away_team, home_team): {"home_pitcher": name_or_None, "away_pitcher": name_or_None}}
for all MLB games in the next 7 days.
"""
today = pd.Timestamp.utcnow().date()
end_date = today + timedelta(days=7)
params = {
"sportId": 1,
"startDate": today.isoformat(),
"endDate": end_date.isoformat(),
"hydrate": "probablePitcher",
"gameType": "R,F,D,L,W",
}
try:
r = requests.get(_SCHEDULE_URL, params=params, timeout=15)
r.raise_for_status()
except Exception as exc:
_log.warning("[mlb_starters] schedule fetch failed: %s", exc)
return {}
result: dict[tuple[str, str], dict[str, str | None]] = {}
for date_entry in r.json().get("dates", []):
for game in date_entry.get("games", []):
teams = game.get("teams", {})
away_name = teams.get("away", {}).get("team", {}).get("name", "")
home_name = teams.get("home", {}).get("team", {}).get("name", "")
away_pitcher = teams.get("away", {}).get("probablePitcher", {}).get("fullName")
home_pitcher = teams.get("home", {}).get("probablePitcher", {}).get("fullName")
if away_name and home_name:
result[(away_name, home_name)] = {
"home_pitcher": home_pitcher,
"away_pitcher": away_pitcher,
}
_log.warning("[mlb_starters] games_with_starters=%d", sum(1 for v in result.values() if v["home_pitcher"] or v["away_pitcher"]))
return result
Note: MLB API team names (e.g. "New York Yankees") may differ from props row team names (The
Odds API uses "New York Yankees" format). A fuzzy match or alias map may be needed; add
_normalize_team(name) helper that lowercases + strips punctuation for comparison.
---
Fix C part 4: analytics/props_mapper.py β€” full pre-game model stack
4a. Add HOME_TEAM_TO_STADIUM mapping (30 MLB teams β†’ canonical stadium name)
# Maps Odds API home_team names β†’ canonical names accepted by models/stadium_lookup.resolve_stadium()
HOME_TEAM_TO_STADIUM: dict[str, str] = {
"Baltimore Orioles": "oriole park at camden yards",
"Boston Red Sox": "fenway park",
"New York Yankees": "yankee stadium",
"Tampa Bay Rays": "tropicana field",
"Toronto Blue Jays": "rogers centre",
"Chicago White Sox": "guaranteed rate field",
"Cleveland Guardians": "progressive field",
"Detroit Tigers": "comerica park",
"Kansas City Royals": "kauffman stadium",
"Minnesota Twins": "target field",
"Houston Astros": "minute maid park",
"Los Angeles Angels": "angel stadium",
"Oakland Athletics": "athletics ballpark",
"Seattle Mariners": "t-mobile park",
"Texas Rangers": "globe life field",
"Atlanta Braves": "truist park",
"Miami Marlins": "loandepot park",
"New York Mets": "citi field",
"Philadelphia Phillies": "citizens bank park",
"Washington Nationals": "nationals park",
"Chicago Cubs": "wrigley field",
"Cincinnati Reds": "great american ball park",
"Milwaukee Brewers": "american family field",
"Pittsburgh Pirates": "pnc park",
"St. Louis Cardinals": "busch stadium",
"Arizona Diamondbacks": "chase field",
"Colorado Rockies": "coors field",
"Los Angeles Dodgers": "dodger stadium",
"San Diego Padres": "petco park",
"San Francisco Giants": "oracle park",
}
4b. Add batter team lookup helper
def _lookup_batter_team(
player_name_normalized: str,
props_row_away_team: str,
props_row_home_team: str,
statcast_df: pd.DataFrame,
) -> str | None:
"""
Returns "home" or "away" for which team the batter plays on, or None if unknown.
Uses statcast game records: find rows where this player appears and check if the
game's away_team or home_team matches the props row matchup.
"""
# Try to match batter's statcast rows against the props game teams
if statcast_df.empty or "player_name" not in statcast_df.columns:
return None
# (implementation: filter statcast by player_name, check home_team/away_team columns)
4c. Replace _get_pregame_context_adjustments() with _get_full_pregame_adjustments()
New signature:
def _get_full_pregame_adjustments(
props_row: Any,
batter_features: dict, # already computed batter baseline features
statcast_df: pd.DataFrame, # batter-perspective (player_name = batter)
pitcher_statcast_df: pd.DataFrame, # pitcher-perspective (player_name = pitcher)
probable_starters: dict, # {(away_team, home_team): {home_pitcher, away_pitcher}}
) -> tuple[float, str]: # (total_adj, source_detail_str)
Returns total additive HR probability adjustment and a pipe-separated source string
(e.g. "baseline+pitcher_quality+park+rolling_form+zone_matchup+arsenal_matchup").
Components applied in order, each try/except, no-op on failure:
1. Park factor β€” via HOME_TEAM_TO_STADIUM[home_team] β†’ resolve_stadium() β†’
compute_park_adjustment() β†’ clamp to Β±0.015
2. Probable pitcher lookup β€” match (away_team, home_team) in probable_starters dict
(fuzzy normalize both sides). Determine batter's team β†’ select opposing starting pitcher.
3. Pitcher quality (proper) β€” build_pitcher_feature_row(pitcher_statcast_df, pitcher_name)
β†’ compute_pitcher_adjustment(batter_features, pitcher_row) from
models/pitcher_adjustment.py. Use result's hr_adj directly. Clamp Β±0.010.
4. Zone matchup (if data available) β€” build_batter_zone_feature_row(statcast_df, player_name)
(from remote DB via batter_zone_store) + build_pitcher_zone_feature_row(pitcher_statcast_df, pitcher_name) β†’ compute_zone_matchup_adjustment(). Use hr_zone_boost
- baseline_hr delta,
clamped Β±0.010.
5. Arsenal matchup β€” build_batter_arsenal_feature_row(statcast_df, statcast_name)
- build_pitcher_arsenal_feature_row(pitcher_statcast_df, pitcher_name)
β†’ compute_arsenal_matchup_adjustment(). Use arsenal_hr_boost delta, clamped Β±0.010.
6. Rolling form β€” build_batter_rolling_form_row(statcast_df, statcast_name, reference_date=today)
- build_pitcher_rolling_form_row(pitcher_statcast_df, pitcher_name, reference_date=today)
β†’ compute_upcoming_rolling_adjustment(batter_roll, pitcher_roll, batter_features, pitcher_row).
Use rolling_hr_adjustment directly.
4d. Update map_hr_props_to_model() signature
def map_hr_props_to_model(
props_df: pd.DataFrame,
statcast_df: pd.DataFrame,
prob_fn: ... | None = None,
pitcher_stats_df: pd.DataFrame | None = None, # existing param (kept)
pitcher_statcast_df: pd.DataFrame | None = None, # NEW: pitcher-perspective statcast
probable_starters: dict | None = None, # NEW: {(away,home): {pitchers}}
) -> pd.DataFrame:
Inside the per-row loop, replace _get_pregame_context_adjustments() call with
_get_full_pregame_adjustments() call using the new params.
---
Fix C part 5: visualization/props_page.py β€” pass new params
Update render_props() signature to accept and forward:
def render_props(
statcast_df: pd.DataFrame,
conn=None,
raw_props: pd.DataFrame | None = None,
pitcher_statcast_df: pd.DataFrame | None = None, # NEW
probable_starters: dict | None = None, # NEW
) -> None:
At line 107:
# CURRENT:
mapped = map_hr_props_to_model(filtered_raw, statcast_df)
# NEW:
mapped = map_hr_props_to_model(
filtered_raw,
statcast_df,
pitcher_statcast_df=pitcher_statcast_df,
probable_starters=probable_starters,
)
---
Expected Behavior After All Fixes
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Scenario β”‚ Books shown β”‚ Model HR% β”‚ Model components β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Season in progress, all books posting β”‚ All books (Odds API + scraper) β”‚ Recent statcast β”‚ All pre-game models β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Pre-season, starters announced β”‚ 4 books (scraper) β”‚ 2025 season β”‚ All pre-game models β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Pre-season, no starters yet β”‚ 4 books (scraper) β”‚ 2025 season β”‚ Baseline + park (no pitcher) β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ All providers fail β”‚ "No HR props" warning β”‚ β€” β”‚ β€” β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
---
Verification
1. Props tab loads without error
2. Multiple sportsbooks appear in Book column
3. Model HR% column shows percentages for most players
4. Source column shows enriched source string (e.g. "baseline+pitcher_quality+park")
5. Check logs: [fetch_all_upcoming_hr_props] unique_books= shows 2+ books
6. Check logs: [mlb_starters] games_with_starters=N shows >0 when games are announced
β•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œβ•Œ
Claude has written up a plan and is ready to execute. Would you like to proceed?
❯ 1. Yes, auto-accept edits
2. Yes, manually approve edits
3. Type here to tell Claude what to change