Here is Claude's plan: ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌ Fix: Multi-book display + Full Pre-Game Model Stack for HR% Context Props are now rendering (Caesars only, model HR% blank). Three issues to fix: Issue A — Only one sportsbook showing (Caesars) Stop-at-first provider logic returns Caesars (from Odds API) and never runs the scraper for DK/FD/BetMGM. Fix: concat-all providers, dedup by best odds per player+book+market. Issue B — Model HR% blank (pre-season statcast empty) load_statcast_recent() returns empty pre-season (no 2026 games). Name index is empty → all players return "unavailable". Fix: fall back to load_statcast_previous_season_full() (2025). Issue C — Model uses only batter baseline analytics/props_mapper.py currently applies only compute_batter_baseline(). Full pre-game model includes: park factor, pitcher quality, zone matchup, arsenal matchup, rolling form. Key data gap: (1) statcast_df is batter-perspective (player_type=batter) — no pitcher rows. (2) Props rows have no pitcher name or venue. Both must be resolved. --- Files to Modify / Create 1. data/live_prop_odds.py — concat-all with dedup (Fix A) 2. app.py — statcast fallback + pitcher statcast loader + Props call site changes (Fixes B + C) 3. data/statcast.py — add player_type param to _query_statcast() 4. data/mlb_starters.py (NEW) — probable starters from MLB Stats API 5. analytics/props_mapper.py — full pre-game model stack (Fix C) 6. visualization/props_page.py — pass pitcher_statcast_df + probable_starters through --- Fix A: data/live_prop_odds.py — concat-all with dedup Replace stop-at-first loop. Run ALL providers, concatenate, dedup keeping best odds. # CURRENT loop (stop-at-first): for provider in providers: ... if not df.empty: return normalize_prop_odds(df) # exits on first non-empty return pd.DataFrame() # NEW loop (concat-all, then dedup): frames = [] for provider in providers: try: fetch_fn = getattr(provider, "fetch_all_upcoming_hr_props", None) if fetch_fn is None: continue df = fetch_fn(sportsbooks=sportsbooks) if not df.empty: frames.append(df) except Exception as e: logger.warning(f"[odds_provider_fetch] failure: {e}", exc_info=True) continue if not frames: return pd.DataFrame() merged = pd.concat(frames, ignore_index=True) merged = normalize_prop_odds(merged) # Dedup: keep one row per (player_name, sportsbook_key, market) — best odds wins if not merged.empty and "sportsbook_key" in merged.columns: merged["_odds_score"] = merged["odds_american"].apply( lambda x: int(x) if pd.notna(x) else -9999 ) merged = ( merged .sort_values("_odds_score", ascending=False) .drop_duplicates(subset=["player_name", "sportsbook_key", "market"], keep="first") .drop(columns=["_odds_score"]) .reset_index(drop=True) ) logger.warning( "[fetch_all_upcoming_hr_props] providers=%d frames=%d merged_rows=%d unique_books=%s", len(providers), len(frames), len(merged), sorted(merged["sportsbook"].dropna().unique().tolist()) if not merged.empty else [], ) return merged --- Fix B: app.py — statcast fallback for pre-season (line 3409 area) # CURRENT: render_props(load_statcast_recent(), conn=conn, raw_props=load_upcoming_hr_props()) # NEW: _statcast_for_props = load_statcast_recent() if _statcast_for_props.empty: _statcast_for_props = load_statcast_previous_season_full() --- Fix C part 1: data/statcast.py — add player_type param _query_statcast() currently hardcodes "player_type": "batter". Add a parameter so pitcher perspective can be fetched separately. def _query_statcast(start_date: str, end_date: str, season: str, player_type: str = "batter") -> pd.DataFrame: params = { ... "player_type": player_type, # was hardcoded "batter" ... } fetch_statcast_range() keeps its existing signature unchanged (defaults to batter). Add new public function: def fetch_statcast_range_pitcher(start_date: str, end_date: str) -> pd.DataFrame: """Fetch pitcher-perspective Statcast (player_name = pitcher name).""" season = str(datetime.strptime(start_date, "%Y-%m-%d").year) return _query_statcast(start_date, end_date, season=season, player_type="pitcher") --- Fix C part 2: app.py — load pitcher statcast + probable starters Add alongside load_statcast_previous_season_full(): @st.cache_data(ttl=60 * 60 * 12, show_spinner=False) def load_statcast_previous_season_full_pitcher() -> pd.DataFrame: """2025 season, pitcher perspective — player_name = pitcher name.""" today = pd.Timestamp.utcnow().date() previous_year = today.year - 1 start_date = pd.Timestamp(year=previous_year, month=1, day=1).date() end_date = pd.Timestamp(year=previous_year, month=12, day=31).date() from data.statcast import fetch_statcast_range_pitcher from data.statcast import normalize_statcast from features.pitch_features import add_pitch_features raw = fetch_statcast_range_pitcher(start_date.isoformat(), end_date.isoformat()) normalized = normalize_statcast(raw) return add_pitch_features(normalized) Update the Props call site: _statcast_for_props = load_statcast_recent() if _statcast_for_props.empty: _statcast_for_props = load_statcast_previous_season_full() _pitcher_statcast = load_statcast_previous_season_full_pitcher() from data.mlb_starters import fetch_probable_starters_for_props _probable_starters = fetch_probable_starters_for_props() # {(away_team, home_team): {...}} render_props( _statcast_for_props, conn=conn, raw_props=load_upcoming_hr_props(), pitcher_statcast_df=_pitcher_statcast, probable_starters=_probable_starters, ) --- Fix C part 3: data/mlb_starters.py (NEW FILE) Fetches probable starters from MLB Stats API for the next 7 days. Returns a dict mapping (away_team, home_team) -> {"home_pitcher": str, "away_pitcher": str} for use in props_mapper. Uses the public statsapi.mlb.com/api/v1/schedule endpoint with hydrate=probablePitcher. Cached in the calling layer (Streamlit cache_data). No API key required. MLB API team IDs for team name matching — response includes teams.home.team.name and teams.away.team.name alongside teams.home.probablePitcher.fullName. # data/mlb_starters.py from __future__ import annotations import logging from datetime import timedelta import requests import pandas as pd _log = logging.getLogger(__name__) _SCHEDULE_URL = "https://statsapi.mlb.com/api/v1/schedule" def fetch_probable_starters_for_props() -> dict[tuple[str, str], dict[str, str | None]]: """ Returns {(away_team, home_team): {"home_pitcher": name_or_None, "away_pitcher": name_or_None}} for all MLB games in the next 7 days. """ today = pd.Timestamp.utcnow().date() end_date = today + timedelta(days=7) params = { "sportId": 1, "startDate": today.isoformat(), "endDate": end_date.isoformat(), "hydrate": "probablePitcher", "gameType": "R,F,D,L,W", } try: r = requests.get(_SCHEDULE_URL, params=params, timeout=15) r.raise_for_status() except Exception as exc: _log.warning("[mlb_starters] schedule fetch failed: %s", exc) return {} result: dict[tuple[str, str], dict[str, str | None]] = {} for date_entry in r.json().get("dates", []): for game in date_entry.get("games", []): teams = game.get("teams", {}) away_name = teams.get("away", {}).get("team", {}).get("name", "") home_name = teams.get("home", {}).get("team", {}).get("name", "") away_pitcher = teams.get("away", {}).get("probablePitcher", {}).get("fullName") home_pitcher = teams.get("home", {}).get("probablePitcher", {}).get("fullName") if away_name and home_name: result[(away_name, home_name)] = { "home_pitcher": home_pitcher, "away_pitcher": away_pitcher, } _log.warning("[mlb_starters] games_with_starters=%d", sum(1 for v in result.values() if v["home_pitcher"] or v["away_pitcher"])) return result Note: MLB API team names (e.g. "New York Yankees") may differ from props row team names (The Odds API uses "New York Yankees" format). A fuzzy match or alias map may be needed; add _normalize_team(name) helper that lowercases + strips punctuation for comparison. --- Fix C part 4: analytics/props_mapper.py — full pre-game model stack 4a. Add HOME_TEAM_TO_STADIUM mapping (30 MLB teams → canonical stadium name) # Maps Odds API home_team names → canonical names accepted by models/stadium_lookup.resolve_stadium() HOME_TEAM_TO_STADIUM: dict[str, str] = { "Baltimore Orioles": "oriole park at camden yards", "Boston Red Sox": "fenway park", "New York Yankees": "yankee stadium", "Tampa Bay Rays": "tropicana field", "Toronto Blue Jays": "rogers centre", "Chicago White Sox": "guaranteed rate field", "Cleveland Guardians": "progressive field", "Detroit Tigers": "comerica park", "Kansas City Royals": "kauffman stadium", "Minnesota Twins": "target field", "Houston Astros": "minute maid park", "Los Angeles Angels": "angel stadium", "Oakland Athletics": "athletics ballpark", "Seattle Mariners": "t-mobile park", "Texas Rangers": "globe life field", "Atlanta Braves": "truist park", "Miami Marlins": "loandepot park", "New York Mets": "citi field", "Philadelphia Phillies": "citizens bank park", "Washington Nationals": "nationals park", "Chicago Cubs": "wrigley field", "Cincinnati Reds": "great american ball park", "Milwaukee Brewers": "american family field", "Pittsburgh Pirates": "pnc park", "St. Louis Cardinals": "busch stadium", "Arizona Diamondbacks": "chase field", "Colorado Rockies": "coors field", "Los Angeles Dodgers": "dodger stadium", "San Diego Padres": "petco park", "San Francisco Giants": "oracle park", } 4b. Add batter team lookup helper def _lookup_batter_team( player_name_normalized: str, props_row_away_team: str, props_row_home_team: str, statcast_df: pd.DataFrame, ) -> str | None: """ Returns "home" or "away" for which team the batter plays on, or None if unknown. Uses statcast game records: find rows where this player appears and check if the game's away_team or home_team matches the props row matchup. """ # Try to match batter's statcast rows against the props game teams if statcast_df.empty or "player_name" not in statcast_df.columns: return None # (implementation: filter statcast by player_name, check home_team/away_team columns) 4c. Replace _get_pregame_context_adjustments() with _get_full_pregame_adjustments() New signature: def _get_full_pregame_adjustments( props_row: Any, batter_features: dict, # already computed batter baseline features statcast_df: pd.DataFrame, # batter-perspective (player_name = batter) pitcher_statcast_df: pd.DataFrame, # pitcher-perspective (player_name = pitcher) probable_starters: dict, # {(away_team, home_team): {home_pitcher, away_pitcher}} ) -> tuple[float, str]: # (total_adj, source_detail_str) Returns total additive HR probability adjustment and a pipe-separated source string (e.g. "baseline+pitcher_quality+park+rolling_form+zone_matchup+arsenal_matchup"). Components applied in order, each try/except, no-op on failure: 1. Park factor — via HOME_TEAM_TO_STADIUM[home_team] → resolve_stadium() → compute_park_adjustment() → clamp to ±0.015 2. Probable pitcher lookup — match (away_team, home_team) in probable_starters dict (fuzzy normalize both sides). Determine batter's team → select opposing starting pitcher. 3. Pitcher quality (proper) — build_pitcher_feature_row(pitcher_statcast_df, pitcher_name) → compute_pitcher_adjustment(batter_features, pitcher_row) from models/pitcher_adjustment.py. Use result's hr_adj directly. Clamp ±0.010. 4. Zone matchup (if data available) — build_batter_zone_feature_row(statcast_df, player_name) (from remote DB via batter_zone_store) + build_pitcher_zone_feature_row(pitcher_statcast_df, pitcher_name) → compute_zone_matchup_adjustment(). Use hr_zone_boost - baseline_hr delta, clamped ±0.010. 5. Arsenal matchup — build_batter_arsenal_feature_row(statcast_df, statcast_name) - build_pitcher_arsenal_feature_row(pitcher_statcast_df, pitcher_name) → compute_arsenal_matchup_adjustment(). Use arsenal_hr_boost delta, clamped ±0.010. 6. Rolling form — build_batter_rolling_form_row(statcast_df, statcast_name, reference_date=today) - build_pitcher_rolling_form_row(pitcher_statcast_df, pitcher_name, reference_date=today) → compute_upcoming_rolling_adjustment(batter_roll, pitcher_roll, batter_features, pitcher_row). Use rolling_hr_adjustment directly. 4d. Update map_hr_props_to_model() signature def map_hr_props_to_model( props_df: pd.DataFrame, statcast_df: pd.DataFrame, prob_fn: ... | None = None, pitcher_stats_df: pd.DataFrame | None = None, # existing param (kept) pitcher_statcast_df: pd.DataFrame | None = None, # NEW: pitcher-perspective statcast probable_starters: dict | None = None, # NEW: {(away,home): {pitchers}} ) -> pd.DataFrame: Inside the per-row loop, replace _get_pregame_context_adjustments() call with _get_full_pregame_adjustments() call using the new params. --- Fix C part 5: visualization/props_page.py — pass new params Update render_props() signature to accept and forward: def render_props( statcast_df: pd.DataFrame, conn=None, raw_props: pd.DataFrame | None = None, pitcher_statcast_df: pd.DataFrame | None = None, # NEW probable_starters: dict | None = None, # NEW ) -> None: At line 107: # CURRENT: mapped = map_hr_props_to_model(filtered_raw, statcast_df) # NEW: mapped = map_hr_props_to_model( filtered_raw, statcast_df, pitcher_statcast_df=pitcher_statcast_df, probable_starters=probable_starters, ) --- Expected Behavior After All Fixes ┌───────────────────────────────────────┬────────────────────────────────┬─────────────────┬──────────────────────────────┐ │ Scenario │ Books shown │ Model HR% │ Model components │ ├───────────────────────────────────────┼────────────────────────────────┼─────────────────┼──────────────────────────────┤ │ Season in progress, all books posting │ All books (Odds API + scraper) │ Recent statcast │ All pre-game models │ ├───────────────────────────────────────┼────────────────────────────────┼─────────────────┼──────────────────────────────┤ │ Pre-season, starters announced │ 4 books (scraper) │ 2025 season │ All pre-game models │ ├───────────────────────────────────────┼────────────────────────────────┼─────────────────┼──────────────────────────────┤ │ Pre-season, no starters yet │ 4 books (scraper) │ 2025 season │ Baseline + park (no pitcher) │ ├───────────────────────────────────────┼────────────────────────────────┼─────────────────┼──────────────────────────────┤ │ All providers fail │ "No HR props" warning │ — │ — │ └───────────────────────────────────────┴────────────────────────────────┴─────────────────┴──────────────────────────────┘ --- Verification 1. Props tab loads without error 2. Multiple sportsbooks appear in Book column 3. Model HR% column shows percentages for most players 4. Source column shows enriched source string (e.g. "baseline+pitcher_quality+park") 5. Check logs: [fetch_all_upcoming_hr_props] unique_books= shows 2+ books 6. Check logs: [mlb_starters] games_with_starters=N shows >0 when games are announced ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌ Claude has written up a plan and is ready to execute. Would you like to proceed? ❯ 1. Yes, auto-accept edits 2. Yes, manually approve edits 3. Type here to tell Claude what to change