""" Analysis Guide Tool ==================== Provides methodological guidance for climate data analysis using python_repl. This tool returns TEXT INSTRUCTIONS (not executable code!) for: - What approach to take - How to structure the analysis - Quality checks and pitfalls - Best practices for visualization The agent uses python_repl to execute the actual analysis. """ from typing import Literal from pydantic import BaseModel, Field from langchain_core.tools import StructuredTool # ============================================================================= # ANALYSIS GUIDES # ============================================================================= ANALYSIS_GUIDES = { # ------------------------------------------------------------------------- # DATA OPERATIONS # ------------------------------------------------------------------------- "load_data": """ ## Loading ERA5 Data ### When to use - Initializing any analysis - Loading downloaded Zarr data ### Workflow 1. **Load data** — Use `xr.open_dataset('path', engine='zarr')` or `xr.open_zarr('path')`. 2. **Inspect dataset** — Check coordinates and available variables. 3. **Convert units** before any analysis: - Temp (`t2`, `d2`, `skt`, `sst`, `stl1`): subtract 273.15 → °C - Precip (`tp`, `cp`, `lsp`): multiply by 1000 → mm - Pressure (`sp`, `mslp`): divide by 100 → hPa ### Quality Checklist - [ ] Data loaded lazily (avoid `.load()` on large datasets) - [ ] Units converted before aggregations - [ ] Coordinate names verified (latitude vs lat, etc.) ### Common Pitfalls - ⚠️ Loading multi-year global data into memory causes OOM. Keep operations lazy until subsetted. - ⚠️ Some Zarr stores have `valid_time` instead of `time` — check with `.coords`. - ⚠️ CRITICAL — LONGITUDE WRAPPING: ERA5 natively uses 0-360° longitudes. If your region is in the Western Hemisphere (Americas, Atlantic) or crosses the Prime Meridian, you MUST convert longitudes to -180/+180 BEFORE slicing or plotting. Use `ds.assign_coords(longitude=(((ds.longitude + 180) % 360) - 180)).sortby('longitude')`. Failing to do this causes maps to be 95% blank space with data crushed into a tiny sliver. - ⚠️ UNIT SAFETY: When computing temperature DIFFERENCES or ANOMALIES, do NOT subtract 273.15 from the result. A temperature difference in Kelvin is numerically identical to a difference in °C. Subtracting 273.15 from an anomaly produces absurd ±200°C values. """, "spatial_subset": """ ## Spatial Subsetting ### When to use - Focusing on a specific region, country, or routing bounding box - Reducing data size before heavy analysis ### Workflow 1. **Determine bounds** — Find min/max latitude and longitude. 2. **Check coordinate orientation** — ERA5 latitude is often descending (90 to -90). 3. **Slice data** — `.sel(latitude=slice(north, south), longitude=slice(west, east))`. ### Quality Checklist - [ ] Latitude sliced from North to South (max to min) for descending coords - [ ] Longitudes match dataset format (convert -180/180 ↔ 0/360 if needed) - [ ] Result is not empty — verify with `.shape` ### Common Pitfalls - ⚠️ Slicing `slice(south, north)` on descending coords → empty result. - ⚠️ Crossing the prime meridian in 0-360 coords requires concatenating two slices. - ⚠️ Use `.sel(method='nearest')` for point extraction, not exact matching. - ⚠️ If requested bounds use negative longitudes (e.g., -120 to -80 for US West Coast), ensure you have applied the longitude wrapping from `load_data` FIRST. Otherwise slicing negative values on a 0-360 dataset returns empty data. - ⚠️ Always check latitude orientation: use `slice(north, south)` if descending, `slice(south, north)` if ascending. Verify with `ds.latitude[0] > ds.latitude[-1]`. """, "temporal_subset": """ ## Temporal Subsetting & Aggregation ### When to use - Isolating specific events, months, or seasons - Downsampling hourly data to daily/monthly ### Workflow 1. **Time slice** — `.sel(time=slice('2023-01-01', '2023-12-31'))`. 2. **Filter** — Seasons: `.sel(time=ds.time.dt.season == 'DJF')`. 3. **Resample** — `.resample(time='1D').mean()` for daily means. ### Quality Checklist - [ ] Aggregation matches variable: `.mean()` for T/wind, `.sum()` for precip - [ ] Leap years handled if using day-of-year grouping ### Common Pitfalls - ⚠️ DJF wraps across years — verify start/end boundaries. - ⚠️ `.resample()` (continuous) ≠ `.groupby()` (climatological). Don't mix them up. - ⚠️ Radiation variables (`ssr`, `ssrd`) are accumulated — need differencing, not averaging. - ⚠️ Hourly data is massive. Resample to daily ('1D') or monthly ('MS') IMMEDIATELY after spatial subsetting to avoid Memory/Timeout errors. """, # ------------------------------------------------------------------------- # STATISTICAL ANALYSIS # ------------------------------------------------------------------------- "anomalies": """ ## Anomaly Analysis ### When to use - "How unusual was this period?" - Comparing current conditions to "normal" - Any "above/below average" question ### Workflow 1. **Define baseline** — ≥10 years (30 ideal). E.g. 1991-2020. 2. **Compute climatology** — `clim = ds.groupby('time.month').mean('time')`. 3. **Subtract** — `anomaly = ds.groupby('time.month') - clim`. 4. **Convert units** — Report in °C, mm, m/s (not K, m, Pa). 5. **Assess magnitude** — Compare to σ of the baseline period. ### Data Strategy for Large-Area Anomaly Maps ⚠️ For spatial areas ≥30°×30° (e.g., tropical Pacific), do NOT download 30 years one-by-one! 1. Download target period with 1 `retrieve_era5_data` call 2. Download 3-5 recent years of the same month as baseline (3-5 calls) 3. Average baseline files in `python_repl` → climatology 4. Subtract climatology from target → anomaly map A 5-year baseline is sufficient for spatial anomaly maps. ### Quality Checklist - [ ] Baseline ≥10 years (for temporal/small-area analysis; 3-5 years OK for spatial maps) - [ ] Same calendar grouping for clim and analysis - [ ] Units converted for readability - [ ] Spatial context: is anomaly regional or localized? ### Common Pitfalls - ⚠️ Short baselines amplify noise. - ⚠️ Daily climatologies with <30yr baseline are noisy → use monthly grouping. - ⚠️ Be explicit: spatial anomaly vs temporal anomaly. - ⚠️ CRITICAL MATH BUG: Anomaly in Kelvin EXACTLY EQUALS Anomaly in Celsius. DO NOT subtract 273.15 from a temperature anomaly! If you compute `(SST_K - CLIM_K) - 273.15`, you get absurd ±200°C anomalies. Either convert both to °C BEFORE subtracting, or leave the difference as-is. ### Interpretation - Positive = warmer/wetter/windier than normal. - ±1σ = common, ±2σ = unusual (5%), ±3σ = extreme (0.3%). - Maps: MUST use `RdBu_r` centered at zero via `TwoSlopeNorm`. """, "zscore": """ ## Z-Score Analysis (Standardized Anomalies) ### When to use - Comparing extremity across different variables - Standardizing for regions with different variability - Identifying statistically significant departures ### Workflow 1. **Compute baseline mean** — Grouped by month for seasonality. 2. **Compute baseline std** — Same period, same grouping. 3. **Standardize** — `z = (value - mean) / std`. ### Quality Checklist - [ ] Standard deviation is non-zero everywhere - [ ] Baseline period matches for mean and std ### Common Pitfalls - ⚠️ Precipitation is NOT normally distributed — use SPI or percentiles instead of raw Z-scores. - ⚠️ Z-scores near coastlines can be extreme due to mixed land/ocean std. ### Interpretation - Z = 0: average. ±1: normal (68%). ±2: unusual (5%). ±3: extreme (0.3%). """, "trend_analysis": """ ## Linear Trend Analysis ### When to use - "Is it getting warmer/wetter over time?" - Detecting long-term climate change signals ### Workflow 1. **Downsample** — Convert to annual/seasonal means first. 2. **Regress** — `scipy.stats.linregress` or `np.polyfit(degree=1)`. 3. **Significance** — Extract p-value for the slope. 4. **Scale** — Multiply annual slope by 10 → "per decade". ### Quality Checklist - [ ] Period ≥20-30 years for meaningful trends - [ ] Seasonal cycle removed before fitting - [ ] Significance tested (p < 0.05) - [ ] Report trend as units/decade ### Common Pitfalls - ⚠️ Trend on daily data without removing seasonality → dominated by summer/winter swings. - ⚠️ Short series have uncertain trends — report confidence intervals. - ⚠️ Autocorrelation can inflate significance — consider using Mann-Kendall test. - ⚠️ If p > 0.05, you MUST explicitly state the trend is NOT statistically significant. Do not present insignificant trends as real signals. ### Interpretation - Report as °C/decade. Use stippling on maps for significant areas. """, "eof_analysis": """ ## EOF/PCA Analysis ### When to use - Finding dominant spatial patterns (ENSO, NAO, PDO) - Dimensionality reduction of spatiotemporal data ### Workflow 1. **Deseasonalize** — Compute anomalies to remove the seasonal cycle. 2. **Latitude weighting** — Multiply by `np.sqrt(np.cos(np.deg2rad(lat)))`. 3. **Decompose** — PCA on flattened space dimensions. 4. **Reconstruct** — Map PCs back to spatial grid (EOFs). ### Quality Checklist - [ ] Seasonal cycle removed - [ ] Latitude weighting applied - [ ] Variance explained (%) calculated per mode - [ ] Physical interpretation attempted for leading modes - [ ] Maps of EOF patterns MUST include coastlines (use Cartopy) for geographic context - [ ] Variance explained (%) MUST be explicitly displayed in each plot title ### Common Pitfalls - ⚠️ Unweighted EOFs inflate polar regions artificially. - ⚠️ EOFs are mathematical constructs — not guaranteed to correspond to physical modes. ### Interpretation - EOF1: dominant spatial pattern. PC1: its temporal evolution. - If EOF1 explains >20% variance, it's highly dominant. """, "correlation_analysis": """ ## Correlation Analysis ### When to use - Spatial/temporal correlation mapping - Lead-lag analysis (e.g., SST vs downstream precipitation) - Teleconnection exploration ### Workflow 1. **Deseasonalize both variables** — Remove seasonal cycle from both. 2. **Align time coordinates** — Ensure identical time axes. 3. **Correlate** — `xr.corr(var1, var2, dim='time')`. 4. **Lead-lag** — Use `.shift(time=N)` month offsets to test delayed responses. 5. **Significance** — Compute p-values, mask insignificant areas. ### Quality Checklist - [ ] Both variables deseasonalized - [ ] p-values computed (p < 0.05 for significance) - [ ] Sample size adequate (≥30 time points) ### Common Pitfalls - ⚠️ Correlating raw data captures the seasonal cycle — everything correlates with summer. - ⚠️ Spatial autocorrelation inflates field significance — apply Bonferroni or FDR correction. ### Interpretation - R² gives variance explained. Lead-lag peak indicates response time. - Plot spatial R maps with `RdBu_r`, stipple significant areas. """, "composite_analysis": """ ## Composite Analysis ### When to use - Average conditions during El Niño vs La Niña years - Spatial fingerprint of specific extreme events - "What does the atmosphere look like when X happens?" ### Workflow 1. **Define events** — Boolean mask of times exceeding a threshold (e.g., Niño3.4 > 0.5°C). 2. **Subset data** — `.where(mask, drop=True)`. 3. **Average** — Time mean of the subset = composite. 4. **Compare** — Subtract climatological mean → composite anomaly. ### Quality Checklist - [ ] Sample size ≥10 events for robustness - [ ] Baseline climatology matches the season of the events - [ ] Significance tested via bootstrap or t-test ### Common Pitfalls - ⚠️ Compositing n=2 events → noise, not a physical signal. - ⚠️ Mixing seasons in composite (El Niño in DJF vs JJA) obscures the signal. ### Interpretation - Shows the typical anomaly expected when event occurs. - Plot with `RdBu_r` diverging colormap. Stipple significant areas. """, "diurnal_cycle": """ ## Diurnal Cycle Analysis ### When to use - Hourly variability within days (afternoon convection, nighttime cooling) - Solar radiation patterns ### Workflow 1. **Group by hour** — `ds.groupby('time.hour').mean('time')`. 2. **Convert to local time** — ERA5 is UTC. `Local = UTC + Longitude/15`. 3. **Calculate amplitude** — `diurnal_range = max('hour') - min('hour')`. ### Quality Checklist - [ ] Input data is hourly (not daily/monthly) - [ ] UTC → local time conversion applied before labeling "afternoon"/"morning" ### Common Pitfalls - ⚠️ Averaging global data by UTC hour mixes day and night across longitudes. - ⚠️ Cloud cover (`tcc`) and radiation (`ssrd`) have strong diurnal signals — always check. ### Interpretation - `blh` and `t2` peak mid-afternoon. Convective precip (`cp`) peaks late afternoon over land, early morning over oceans. """, "seasonal_decomposition": """ ## Seasonal Decomposition ### When to use - Separating the seasonal cycle from interannual variability - Visualizing how a specific year deviates from the normal curve ### Workflow 1. **Compute climatology** — `.groupby('time.month').mean('time')`. 2. **Extract anomalies** — Subtract climatology from raw data. 3. **Smooth trend** — Apply 12-month rolling mean to extract multi-year trends. ### Quality Checklist - [ ] Baseline robust (≥10 years) - [ ] Residual = raw - seasonal - trend (should be ~white noise) ### Common Pitfalls - ⚠️ Day-of-year climatologies over short baselines are noisy — smooth with 15-day window. ### Interpretation - Separates variance into: seasonal (predictable), trend (long-term), residual (weather noise). """, "spectral_analysis": """ ## Spectral Analysis ### When to use - Periodicity detection (ENSO 3-7yr, MJO 30-60d, annual/semi-annual) - Confirming suspected oscillatory behavior ### Workflow 1. **Prepare 1D series** — Spatial average or single point. 2. **Detrend** — Remove linear trend AND seasonal cycle. 3. **Compute spectrum** — `scipy.signal.welch` or `periodogram`. 4. **Plot as Period** — X-axis = 1/frequency (years or days), not raw frequency. ### Quality Checklist - [ ] No NaNs in time series (interpolate or drop) - [ ] Time coordinate evenly spaced - [ ] Seasonal cycle removed ### Common Pitfalls - ⚠️ Seasonal cycle dominates spectrum if not removed — drowns everything else. - ⚠️ Short records can't resolve low-frequency oscillations (need ≥3× the period). ### Interpretation - Peaks = dominant cycles. ENSO: 3-7yr. QBO: ~28mo. MJO: 30-60d. Annual: 12mo. """, "spatial_statistics": """ ## Spatial Statistics & Area Averaging ### When to use - Computing a single time series for a geographic region - Area-weighted means for reporting - Field significance testing ### Workflow 1. **Latitude weights** — `weights = np.cos(np.deg2rad(ds.latitude))`. 2. **Apply** — `ds.weighted(weights).mean(dim=['latitude', 'longitude'])`. 3. **Land/sea mask** — Apply if needed (e.g., ocean-only SST average). ### Quality Checklist - [ ] Latitude weighting applied BEFORE spatial averaging - [ ] Land-sea mask applied where relevant - [ ] Units preserved correctly ### Common Pitfalls - ⚠️ Unweighted averages bias toward poles (smaller grid cells over-counted). - ⚠️ Global mean SST must exclude land points. ### Interpretation - Produces physically accurate area-averaged time series. """, "multi_variable": """ ## Multi-Variable Derived Quantities ### When to use - Combining ERA5 variables for derived metrics ### Common Derivations 1. **Wind speed** — `wspd = np.sqrt(u10**2 + v10**2)` (or u100/v100 for hub-height). 2. **Wind direction** — `wdir = (270 - np.degrees(np.arctan2(v10, u10))) % 360`. 3. **Relative humidity** — From `t2` and `d2` using Magnus formula. 4. **Heat index** — Combine `t2` and `d2` (Steadman formula). 5. **Vapour transport** — `IVT ≈ tcwv * wspd` (surface proxy). 6. **Total precip check** — `tp ≈ cp + lsp`. ### Quality Checklist - [ ] Variables share identical grids (time, lat, lon) - [ ] Units matched before combining (both in °C, both in m/s, etc.) ### Common Pitfalls - ⚠️ `mean(speed) ≠ speed_of_means` — always compute speed FIRST, then average. - ⚠️ Wind direction requires proper 4-quadrant atan2, not naive arctan. ### Interpretation - Derived metrics often better represent human/environmental impact than raw fields. """, "climatology_normals": """ ## Climatology Normals (WMO Standard) ### When to use - Computing 30-year normals - Calculating "departure from normal" ### Workflow 1. **Select base period** — Standard WMO epoch: 1991-2020 (or 1981-2010). 2. **Compute monthly averages** — `normals = baseline.groupby('time.month').mean('time')`. 3. **Departure** — `departure = current.groupby('time.month') - normals`. ### Quality Checklist - [ ] Exactly 30 years used - [ ] Same months compared (don't mix Feb normals with March data) ### Common Pitfalls - ⚠️ Moving baselines make comparisons with WMO climate reports inconsistent. ### Interpretation - "Normal" = statistical baseline. Departures express how much current conditions deviate. """, # ------------------------------------------------------------------------- # CLIMATE INDICES & EXTREMES # ------------------------------------------------------------------------- "climate_indices": """ ## Climate Indices ### When to use - Assessing ENSO, NAO, PDO, AMO teleconnections - Correlating local weather with large-scale modes ### Key Indices - **ENSO (Niño 3.4)**: `sst` anomaly, 5°S-5°N, 170°W-120°W. El Niño > +0.5°C, La Niña < -0.5°C. - **NAO**: `mslp` difference, Azores High minus Icelandic Low. Positive → mild European winters. - **PDO**: Leading EOF of North Pacific `sst` (north of 20°N). 20-30yr phases. - **AMO**: Detrended North Atlantic `sst` average. ~60-70yr cycle. ### Workflow 1. **Extract region** — Use standard geographic bounds. 2. **Compute anomaly** — Area-averaged, against 30yr baseline. 3. **Smooth** — 3-to-5 month rolling mean. ### Quality Checklist - [ ] Standard geographic bounds strictly followed - [ ] Rolling mean applied to filter weather noise - [ ] Latitude-weighted area average ### Common Pitfalls - ⚠️ Without rolling mean, the index is too noisy for classification. - ⚠️ Using incorrect region bounds produces a different (invalid) index. - ⚠️ **MJO PROXY FAILURE:** Do NOT use Skin Temperature (`skt`) or SST to track the MJO over the ocean. The signal is effectively zero (~0.1°C variance). Always use Precipitation (`tp`), Total Column Water Vapour (`tcwv`), or Total Cloud Cover (`tcc`). ### Additional Indices - **IOD (Indian Ocean Dipole)**: `sst` anomaly diff between Western (50-70°E, 10°S-10°N) and Eastern (90-110°E, 10°S-0°) poles. """, "extremes": """ ## Extreme Event Analysis ### When to use - Heat/cold extremes, heavy precipitation, tail-risk assessment - Threshold exceedance frequency ### Workflow 1. **Define threshold** — Absolute (e.g., T > 35°C) or percentile-based (> 95th pctl of baseline). 2. **Create mask** — Boolean where condition is met. 3. **Count** — Sum over time for extreme days per year/month. 4. **Trend** — Check if frequency is increasing over time. ### Quality Checklist - [ ] Percentiles from robust baseline (≥30 years) - [ ] Use daily data, not monthly averages - [ ] Units converted before applying thresholds ### Common Pitfalls - ⚠️ 99th percentile on monthly averages misses true daily extremes entirely. - ⚠️ Absolute thresholds (e.g., 35°C) are region-dependent — 35°C is normal in Sahara, extreme in London. ### Interpretation - Increasing frequency of extremes = non-linear climate change impact. - Report as "N days/year exceeding threshold" or "return period shortened from X to Y years". """, "drought_analysis": """ ## Drought Analysis ### When to use - Prolonged precipitation deficits - Agricultural/hydrological impact assessment - SPI (Standardized Precipitation Index) proxy ### Workflow 1. **Extract precip** — Use `tp` in mm (×1000 from meters). 2. **Accumulate** — Rolling sums: `tp.rolling(time=3).sum()` for 3-month SPI. 3. **Standardize** — `(accumulated - mean) / std` → SPI proxy. 4. **Cross-check** — Verify with `swvl1` (soil moisture) for ground-truth. ### Quality Checklist - [ ] Monthly data used (not hourly) - [ ] Baseline ≥30 years for stable statistics - [ ] Multiple accumulation periods tested (1, 3, 6, 12 months) ### Common Pitfalls - ⚠️ Absolute precipitation deficits are meaningless in deserts — always standardize. - ⚠️ Gamma distribution fit (proper SPI) is better than raw Z-score for precip. - ⚠️ CRITICAL BASELINE LENGTH: You MUST use a minimum 30-year baseline (e.g., 1991-2020) to compute the mean and std for SPI standardization. Computing z-scores on a 5-year period (e.g., using 2020-2024 as both study and reference period) is statistically invalid and creates artificial extreme spikes. ### Interpretation - SPI < -1.0: Moderate drought. < -1.5: Severe. < -2.0: Extreme. """, "heatwave_detection": """ ## Heatwave Detection ### When to use - Identifying heatwave events using standard definitions - Assessing heat-related risk periods ### Workflow 1. **Daily data** — Must be daily resolution (resample hourly if needed). 2. **Threshold** — 90th percentile of `t2` per calendar day from baseline. 3. **Exceedance mask** — `is_hot = t2_daily > threshold_90`. 4. **Streak detection** — Find ≥3 consecutive hot days using rolling sum ≥ 3. ### Quality Checklist - [ ] Daily data (not monthly!) - [ ] `t2` converted to °C - [ ] Threshold is per-calendar-day (not a single annual value) - [ ] Duration criterion applied (≥3 days) ### Common Pitfalls - ⚠️ Monthly data — physically impossible to detect heatwaves. - ⚠️ A single hot day is not a heatwave — duration matters. - ⚠️ Nighttime temperatures (`t2` at 00/06 UTC) also matter for health impact. - ⚠️ Using a flat seasonal anomaly threshold (e.g., "Summer Mean > +2°C") is NOT a heatwave detection method. This produces unphysical spatial artifacts. Heatwaves are discrete DAILY extreme events requiring per-calendar-day thresholds. ### Marine Heatwave Extension - For ocean/SST heatwaves, use daily mean SST (not daily max). - Marine heatwaves require ≥5 consecutive days above the 90th percentile threshold. - Use a long baseline (e.g., 1991-2020) with a ±5-day smoothed calendar-day threshold. ### Interpretation - Heatwaves require BOTH intensity (high T) AND duration (consecutive days). - Report: number of events per year, mean duration, max intensity. """, "atmospheric_rivers": """ ## Atmospheric Rivers Detection ### When to use - Detecting AR events from integrated vapour transport proxy - Extreme precipitation risk at landfall ### Workflow 1. **Extract** — `tcwv` + `u10`, `v10`. 2. **Compute IVT proxy** — `ivt = tcwv * np.sqrt(u10**2 + v10**2)`. 3. **Threshold** — IVT proxy > 250 kg/m/s (approximate). 4. **Shape check** — Feature should be elongated (>2000km long, <1000km wide). ### Quality Checklist - [ ] Acknowledge this is surface-wind proxy (true IVT needs pressure-level data) - [ ] Cross-validate with heavy `tp` at landfall - [ ] Check for persistent (≥24h) plume features ### Common Pitfalls - ⚠️ Tropical moisture pools are NOT ARs — wind-speed multiplier is essential to distinguish. - ⚠️ This surface proxy underestimates true IVT — use conservative thresholds. ### Interpretation - High `tcwv` + strong directed wind at coast = extreme flood risk. - Map with `YlGnBu` for moisture intensity. """, "blocking_events": """ ## Atmospheric Blocking Detection ### When to use - Identifying persistent high-pressure blocks from MSLP - Explaining prolonged heatwaves, droughts, or cold spells ### Workflow 1. **Extract** — `mslp` in hPa (÷100 from Pa). 2. **Compute anomalies** — Daily anomalies from climatology. 3. **Detect** — Find positive anomalies > 1.5σ persisting ≥5 days. 4. **Location** — Focus on mid-to-high latitudes (40-70°N typically). ### Quality Checklist - [ ] 3-5 day rolling mean applied to filter transient ridges - [ ] Persistence criterion enforced (≥5 days) - [ ] Mid-latitude focus ### Common Pitfalls - ⚠️ Fast-moving ridges are NOT blocks — persistence is key. - ⚠️ Blocks in the Southern Hemisphere are rarer and weaker. ### Interpretation - Blocks force storms to detour, causing prolonged rain on flanks and drought/heat underneath. """, "energy_budget": """ ## Surface Energy Budget ### When to use - Analyzing radiation balance and surface heating - Solar energy potential assessment ### Workflow 1. **Extract radiation** — `ssrd` (incoming solar), `ssr` (net solar after reflection). 2. **Convert units** — J/m² to W/m² by dividing by accumulation period (3600s for hourly). 3. **Compute albedo proxy** — `albedo ≈ 1 - (ssr / ssrd)` where ssrd > 0. 4. **Seasonal patterns** — Group by month to see radiation cycle. ### Quality Checklist - [ ] Accumulation period properly accounted for (hourly vs daily sums) - [ ] Division by zero protected (nighttime ssrd = 0) - [ ] Units clearly stated: W/m² or MJ/m²/day ### Common Pitfalls - ⚠️ ERA5 radiation is ACCUMULATED over the forecast step — must difference consecutive steps for instantaneous values. - ⚠️ `ssr` already accounts for clouds and albedo — don't double-correct. ### Interpretation - Higher `ssrd` - High solar potential. Low `ssr/ssrd` ratio → high cloudiness or reflective surface (snow/ice). """, "wind_energy": """ ## Wind Energy Assessment ### When to use - Wind power density analysis - Turbine hub-height wind resource mapping ### Workflow 1. **Use hub-height winds** — `u100`, `v100` (100m, not 10m surface winds). 2. **Compute speed** — `wspd100 = np.sqrt(u100**2 + v100**2)`. 3. **Power density** — `P = 0.5 * rho * wspd100**3` where rho ≈ 1.225 kg/m³. 4. **Capacity factor** — Fraction of time wind exceeds cut-in speed (~3 m/s) and stays below cut-out (~25 m/s). 5. **Weibull fit** — Fit shape (k) and scale (A) parameters to the wind speed distribution. ### Quality Checklist - [ ] Using 100m winds, NOT 10m (turbines don't operate at surface) - [ ] Power density in W/m² - [ ] Seasonal variation checked (winter vs summer) ### Common Pitfalls - ⚠️ Using 10m winds severely underestimates wind energy potential. - ⚠️ Mean wind speed misleads — power depends on speed CUBED, so variability matters enormously. ### Interpretation - Power density >400 W/m² = excellent wind resource. - Report Weibull k parameter: k < 2 = gusty/variable, k > 3 = steady flow. """, "moisture_budget": """ ## Moisture Budget Analysis ### When to use - Understanding precipitation sources - Tracking moisture plumes and convergence zones ### Workflow 1. **Extract** — `tcwv` (precipitable water), `tcw` (total column water incl. liquid/ice). 2. **Temporal evolution** — Track `tcwv` changes to infer moisture convergence. 3. **Relate to precip** — Compare `tcwv` peaks with `tp` to see conversion efficiency. 4. **Spatial patterns** — Map `tcwv` to identify moisture corridors. ### Quality Checklist - [ ] Distinguish `tcwv` (vapour only) from `tcw` (vapour + liquid + ice) - [ ] Units: kg/m² (equivalent to mm of water) ### Common Pitfalls - ⚠️ High `tcwv` doesn't guarantee rain — need a lifting mechanism. - ⚠️ `tcw - tcwv` gives cloud water + ice content (proxy for cloud thickness). ### Interpretation - `tcwv` > 50 kg/m² in tropics = moisture-laden atmosphere primed for heavy precip. """, "convective_potential": """ ## Convective Potential (Thunderstorm Risk) ### When to use - Thunderstorm forecasting and climatology - Severe weather risk assessment ### Workflow 1. **Extract CAPE** — Already available as `cape` variable (J/kg). 2. **Classify risk** — Low (<300), Moderate (300-1000), High (1000-2500), Extreme (>2500 J/kg). 3. **Combine with moisture** — High CAPE + high `tcwv` → heavy convective storms. 4. **Check trigger** — Fronts, orography, or strong daytime heating (`t2` diurnal cycle). ### Quality Checklist - [ ] CAPE alone is insufficient — need a trigger mechanism - [ ] Check `blh` (boundary layer height) — deep BLH aids convective initiation ### Common Pitfalls - ⚠️ CAPE = potential energy, not a guarantee. High CAPE + strong capping inversion = no storms. - ⚠️ CAPE is most meaningful in afternoon hours — avoid pre-dawn values. ### Interpretation - CAPE > 1000 J/kg with deep BLH (>2km) and high `tcwv` = significant thunderstorm risk. """, "snow_cover": """ ## Snow Cover & Melt Analysis ### When to use - Tracking snow accumulation and melt timing - Climate change impacts on snowpack ### Workflow 1. **Extract** — `sd` (Snow Depth in m water equivalent). 2. **Seasonal cycle** — Track start/end of snow season per grid point. 3. **Melt timing** — Find the date when `sd` drops below threshold. 4. **Trend** — Check if snow season is shortening over decades. 5. **Compare with `stl1`/`t2`** — Warming soil accelerates melt. ### Quality Checklist - [ ] Units: meters of water equivalent - [ ] Focus on mid/high latitudes and mountain regions - [ ] Inter-annual variability large — use multi-year analysis ### Common Pitfalls - ⚠️ ERA5 snow depth is modeled, not observed — cross-reference with station data. - ⚠️ Rain-on-snow events can cause rapid melt not captured well in reanalysis. ### Interpretation - Earlier melt = less summer water supply. Map with `Blues`, reversed for snowless areas. """, # ------------------------------------------------------------------------- # VISUALIZATION # ------------------------------------------------------------------------- "visualization_spatial": """ ## Spatial Map Visualization ### When to use - Mapping absolute climate fields (Temp, Wind, Precip, Pressure) ### Workflow 1. **Figure** — `fig, ax = plt.subplots(figsize=(12, 8))`. 2. **Meshgrid** — `lons, lats = np.meshgrid(data.longitude, data.latitude)`. 3. **Plot** — `ax.pcolormesh(lons, lats, data, cmap=..., shading='auto')`. 4. **Colorbar** — ALWAYS: `plt.colorbar(mesh, ax=ax, label='Units', shrink=0.8)`. 5. **Cartopy** — Optional: add coastlines, land fill. Graceful fallback if not installed. ### Quality Checklist - [ ] Figure 12×8 for maps - [ ] Colormap matches variable: - Temp: `RdYlBu_r` | Wind: `YlOrRd` | Precip: `YlGnBu` - Pressure: `viridis` | Cloud: `Greys` | Anomalies: `RdBu_r` - [ ] NEVER use `jet` - [ ] Colorbar has label with units - [ ] CARTOPY IS MANDATORY: Always use `cartopy.crs` projections with `ax.coastlines()` and `ax.add_feature(cfeature.BORDERS)`. Maps without coastlines appear as meaningless color blobs. - [ ] Always pass `transform=ccrs.PlateCarree()` to `pcolormesh`/`contourf` when using Cartopy. - [ ] For Arctic regions (latitude > 60°N), use `ccrs.NorthPolarStereo()` instead of `PlateCarree` to avoid extreme distortion. - [ ] For US regional maps, add `cfeature.STATES` for state boundaries. - [ ] NEVER use `Greys` colormap for humidity or precipitation. Use `YlGnBu` or `BrBG`. - [ ] Categorical/binary maps (like hotspot masks) should use a categorical legend, not a continuous 0-1 colorbar. ### Common Pitfalls - ⚠️ Diverging cmap on absolute data is misleading — diverging only for anomalies. - ⚠️ Missing `shading='auto'` triggers deprecation warning. """, "visualization_timeseries": """ ## Time Series Visualization ### When to use - Temporal evolution of a variable at a point or region ### Workflow 1. **Area average** — `ts = data.mean(dim=['latitude', 'longitude'])` (with lat weighting!). 2. **Figure** — `fig, ax = plt.subplots(figsize=(10, 6))`. 3. **Raw line** — `ax.plot(ts.time, ts, linewidth=1.5)`. 4. **Smoothing** — Add rolling mean overlay with contrasting color. 5. **Date formatting** — `fig.autofmt_xdate(rotation=30)`. ### Quality Checklist - [ ] Figure 10×6 - [ ] Y-axis has explicit units - [ ] Legend included if multiple lines - [ ] Trend line if requested: dashed with slope annotation ### Enhancements - **Uncertainty band**: `ax.fill_between(time, mean-std, mean+std, alpha=0.2)` - **Event markers**: `ax.axvline(date, color='red', ls='--')` - **Twin axis**: `ax2 = ax.twinx()` for second variable - **Date formatting**: Always use proper date labels (e.g., `mdates.DateFormatter('%b %d')`), NEVER raw day-of-month integers (1, 2, ... 31). - **Y-axis range**: Do not set y-limits too narrow to artificially exaggerate peaks. Keep ranges physically reasonable. - **Dual axes coloring**: If using `ax2 = ax.twinx()`, color the y-tick labels to match the corresponding line colors. - **Grid lines**: Always add `ax.grid(True, alpha=0.3)` for precise value comparison. ### Common Pitfalls - ⚠️ Hourly data over 10+ years → unreadable block of ink. Resample to daily first. """, "visualization_anomaly_map": """ ## Anomaly Map Visualization ### When to use - Diverging data: departures, trends, z-scores - Any map that has positive AND negative values ### Workflow 1. **Center at zero** — `from matplotlib.colors import TwoSlopeNorm`. 2. **Norm** — `norm = TwoSlopeNorm(vmin=data.min(), vcenter=0, vmax=data.max())`. 3. **Plot** — `pcolormesh(..., cmap='RdBu_r', norm=norm)`. 4. **Stippling** — Overlay significance: `contourf(..., levels=[0, 0.05], hatches=['...'], colors='none')`. ### Quality Checklist - [ ] Zero is EXACTLY white/neutral in the colorbar - [ ] Warm/dry = Red; Cool/wet = Blue - [ ] Precip anomalies: consider `BrBG` instead of `RdBu_r` ### Common Pitfalls - ⚠️ Without `TwoSlopeNorm`, skewed data makes 0 appear colored → reader is misled. - ⚠️ Symmetric vmin/vmax (`vmax = max(abs(data))`) can also work but wastes color range. - ⚠️ CARTOPY IS MANDATORY for anomaly maps — always add `ax.coastlines()` and `ax.add_feature(cfeature.BORDERS)`. - ⚠️ ROBUST COLORBAR LIMITS: NEVER use raw `data.min()` and `data.max()` for anomaly map limits. A single outlier cell can result in ±200°C scale making the map unreadable. Always use percentile-based limits: `vmax = np.nanpercentile(np.abs(data), 98)`. - ⚠️ Always pass `transform=ccrs.PlateCarree()` when plotting with Cartopy. """, "visualization_wind": """ ## Wind & Vector Visualization ### When to use - Circulation patterns, wind fields, quiver/streamline plots ### Workflow 1. **Speed background** — `wspd` with `pcolormesh` + `YlOrRd`. 2. **Subsample vectors** — `skip = (slice(None, None, 5), slice(None, None, 5))` to avoid solid black. 3. **Quiver** — `ax.quiver(lons[skip], lats[skip], u[skip], v[skip], color='black')`. 4. **Alternative** — `ax.streamplot()` for flow visualization (less cluttered). ### Quality Checklist - [ ] Background heatmap shows magnitude - [ ] Vectors sparse enough to be readable - [ ] Wind barbs: `ax.barbs()` for meteorological display ### Common Pitfalls - ⚠️ Full-resolution quiver = completely black, unreadable mess. MUST subsample vectors. - ⚠️ Check arrow scaling — default autoscale can make light winds invisible. - ⚠️ REFERENCE ARROW MANDATORY: Always add `ax.quiverkey(q, 0.9, 1.05, 10, '10 m/s', labelpos='E')`. Without this, arrow magnitudes are uninterpretable. - ⚠️ CARTOPY IS MANDATORY: Add `ax.coastlines()` and `ax.add_feature(cfeature.BORDERS)` to all wind maps. - ⚠️ Always pass `transform=ccrs.PlateCarree()` to quiver/streamplot when using Cartopy. ### Interpretation - Arrows = direction, background color = magnitude. Cyclonic rotation = storm. """, "visualization_comparison": """ ## Multi-Panel Comparison ### When to use - Before/after, two periods, difference maps - Multi-variable side-by-side ### Workflow 1. **Grid** — `fig, axes = plt.subplots(1, 3, figsize=(18, 6))`. 2. **Panels 1 & 2** — Absolute values with SHARED `vmin`/`vmax`. 3. **Panel 3** — Difference (A-B) with diverging cmap centered at zero. ### Quality Checklist - [ ] Panels 1 & 2 share EXACT same vmin/vmax (otherwise visual comparison is invalid) - [ ] Panel 3 has its own divergent colorbar centered at zero - [ ] Titles clearly label what each panel shows ### Common Pitfalls - ⚠️ Auto-scaled panels = impossible to compare visually. Always lock limits. - ⚠️ Use Cartopy projections for ALL map panels: `subplot_kw={'projection': ccrs.PlateCarree()}`. Add `ax.coastlines()` to each. - ⚠️ Always pass `transform=ccrs.PlateCarree()` to each panel's plotting call. """, "visualization_profile": """ ## Hovmöller Diagrams ### When to use - Lat-time or lon-time cross-sections - Tracking wave propagation, ITCZ migration, monsoon onset ### Workflow 1. **Average out one dimension** — e.g., average across latitudes to get (lon, time). 2. **Transpose** — X=Time, Y=Lon/Lat. 3. **Plot** — `contourf` or `pcolormesh`, figure 12×6. ### Quality Checklist - [ ] X-axis uses date formatting - [ ] Y-axis labels state the averaged geographic slice - [ ] Colormap matches variable type ### Common Pitfalls - ⚠️ Swapping axes makes the diagram unintuitive. Time → X-axis convention. ### Proxy Selection for Hovmöller - ⚠️ For MJO tracking over the ocean, DO NOT use Skin Temperature (`skt`) — the signal is too weak (~0.1°C). Use Convective Precipitation (`cp`), Total Precipitation (`tp`), Total Column Water Vapour (`tcwv`), or Total Cloud Cover (`tcc`). - ⚠️ Remove the seasonal cycle (subtract 30-day running mean) to isolate intraseasonal signals like MJO (30-60 day periods). - ⚠️ Longitude axis must use standard geographic convention (-180 to +180), not 0-360. ### Interpretation - Diagonal banding = propagating waves/systems. Vertical banding = stationary patterns. """, "visualization_distribution": """ ## Distribution Visualization ### When to use - Histograms, PDFs, box plots - Comparing two time periods or regions ### Workflow 1. **Flatten** — `.values.flatten()`, drop NaNs. 2. **Shared bins** — `np.linspace(min, max, 50)`. 3. **Plot** — `ax.hist(data, bins=bins, alpha=0.5, density=True, label='Period')`. 4. **Median/mean markers** — Vertical lines with annotation. ### Quality Checklist - [ ] `density=True` for comparing different-sized samples - [ ] `alpha=0.5` for overlapping distributions - [ ] Legend when comparing multiple distributions ### Common Pitfalls - ⚠️ Raw counts (not density) skew comparison between periods with different sample sizes. - ⚠️ Too few bins = lost detail. Too many = noisy. 30-50 bins is usually good. ### Interpretation - Rightward shift = warming. Flatter + wider = more variability = more extremes. """, "visualization_animation": """ ## Animated/Sequential Maps ### When to use - Monthly/seasonal evolution of a field - Event lifecycle (genesis → peak → decay) ### Workflow 1. **Global limits** — Find absolute vmin/vmax across ALL timesteps. 2. **Multi-panel grid** — `fig, axes = plt.subplots(2, 3, figsize=(18, 12))` for 6 timesteps. 3. **Lock colorbars** — Same vmin/vmax on every panel. 4. **Shared colorbar** — Remove per-panel colorbars, add one at the bottom. ### Quality Checklist - [ ] Colorbar limits LOCKED across all panels (no jumping colors) - [ ] Timestamps clearly labeled on each panel - [ ] Static grid preferred over video (headless environment) ### Common Pitfalls - ⚠️ Auto-scaled panels flash/jump between frames — always lock limits. - ⚠️ MP4/GIF generation may fail in headless — use PNG grids instead. - ⚠️ Use Cartopy projections for ALL map panels: `subplot_kw={'projection': ccrs.PlateCarree()}`. Add `.coastlines()` to each axis. """, "visualization_dashboard": """ ## Summary Dashboard ### When to use - Comprehensive overview: map + time series + statistics in one figure - Publication-ready event summaries ### Workflow 1. **Layout** — `fig = plt.figure(figsize=(16, 10))` + `matplotlib.gridspec`. 2. **Top row** — Large spatial map (anomaly or mean field). 3. **Bottom left** — Time series of regional mean. 4. **Bottom right** — Distribution histogram or box plot. ### Quality Checklist - [ ] `plt.tight_layout()` or `constrained_layout=True` to prevent overlap - [ ] Consistent color theme across all panels - [ ] Clear panel labels (a, b, c) ### Common Pitfalls - ⚠️ Cramming too much into small figure → illegible text. Scale figure size up. - ⚠️ Different aspect ratios between map and time series need explicit gridspec ratios. - ⚠️ MIXED PROJECTION DANGER: Cartopy projections must ONLY be applied to MAP axes. If you add `projection=ccrs.PlateCarree()` to a time series or histogram panel, it will break the plot. Use `fig.add_subplot(gs[...], projection=ccrs.PlateCarree())` ONLY for spatial map panels. - ⚠️ For dashboards showing Americas/Atlantic regions (e.g., Hurricane Otis), always wrap longitudes to -180/+180 and use `ax.set_extent([west, east, south, north])`. """, "visualization_contour": """ ## Contour & Isobar Plots ### When to use - Pressure maps with isobars - Temperature isotherms - Any smoothly varying field where specific levels matter ### Workflow 1. **Define levels** — `levels = np.arange(990, 1040, 4)` for MSLP isobars. 2. **Filled contour** — `ax.contourf(lons, lats, data, levels=levels, cmap=...)`. 3. **Contour lines** — `cs = ax.contour(lons, lats, data, levels=levels, colors='black', linewidths=0.5)`. 4. **Labels** — `ax.clabel(cs, inline=True, fontsize=8)`. ### Quality Checklist - [ ] Level spacing is physically meaningful (e.g., 4 hPa for MSLP) - [ ] Contour labels don't overlap - [ ] Filled + line contours combined for best readability ### Common Pitfalls - ⚠️ Too many levels → cluttered, unreadable. 10-15 levels max. - ⚠️ Non-uniform level spacing requires manual colorbar ticks. - ⚠️ CARTOPY IS MANDATORY: Use `subplot_kw={'projection': ccrs.PlateCarree()}` and add `ax.coastlines()`. - ⚠️ Always pass `transform=ccrs.PlateCarree()` to `contour` and `contourf` calls. ### Interpretation - Tightly packed isobars = strong pressure gradient = high winds. """, "visualization_correlation_map": """ ## Spatial Correlation Maps ### When to use - Showing where a variable correlates with an index (e.g., ENSO vs global precip) - Teleconnection mapping ### Workflow 1. **Compute index** — 1D time series (e.g., Niño3.4 SST anomaly). 2. **Correlate** — `xr.corr(index, spatial_field, dim='time')` → 2D R-map. 3. **Significance** — Compute p-values from sample size and R. 4. **Plot** — Map R values with `RdBu_r` centered at zero. Stipple p < 0.05. ### Quality Checklist - [ ] Both index and field deseasonalized - [ ] R-map centered at zero (TwoSlopeNorm or symmetric limits) - [ ] Significant areas stippled or hatched - [ ] Sample size ≥30 stated ### Common Pitfalls - ⚠️ Raw data correlations dominated by shared seasonal cycle. - ⚠️ Field significance: many grid points → some will be significant by chance. Apply FDR correction. ### Interpretation - R > 0: in-phase with index. R < 0: out-of-phase. |R| > 0.5 = strong relationship. """, # ------------------------------------------------------------------------- # MARITIME ANALYSIS # ------------------------------------------------------------------------- "maritime_route": """ ## Maritime Route Risk Analysis ### When to use - Analyzing weather risks along calculated shipping lanes - Voyage planning and hazard assessment ### Workflow 1. **Route** — Call `calculate_maritime_route` → waypoints + bounding box. 2. **Data** — Download `u10`, `v10` for route bbox, target month, last 3 years. 3. **Wind speed** — `wspd = np.sqrt(u10**2 + v10**2)`. 4. **Extract** — Loop waypoints: `.sel(lat=lat, lon=lon, method='nearest')`. 5. **Risk classify** — Safe (<10), Caution (10-17), Danger (17-24), Extreme (>24 m/s). 6. **Statistics** — P95 wind speed at each waypoint, % time in each risk category. ### Quality Checklist - [ ] Bounding box from route tool used DIRECTLY (don't convert coords) - [ ] 3-year period for climatological context, not just one date - [ ] Risk categories applied at waypoint level ### Common Pitfalls - ⚠️ Global hourly downloads → timeout. Subset tightly to route bbox. - ⚠️ Don't use bounding box mean — extract AT waypoints for route-specific risk. """, "maritime_visualization": """ ## Maritime Route Risk Visualization ### When to use - Plotting route risk maps with waypoint-level risk coloring ### Workflow 1. **Background** — Map mean `wspd` with `pcolormesh` + `YlOrRd`. 2. **Route line** — Dashed line connecting waypoints. 3. **Waypoint scatter** — Color by risk: Green (<10), Amber (10-17), Coral (17-24), Red (>24 m/s). 4. **Labels** — "ORIGIN" and "DEST" annotations. 5. **Legend** — Custom 4-category legend (mandatory). ### Quality Checklist - [ ] 4-category risk legend ALWAYS included - [ ] Origin/Destination labeled - [ ] Colormap: `YlOrRd` for wind speed - [ ] Saved to PLOTS_DIR ### Common Pitfalls - ⚠️ No legend → colored dots are meaningless to the user. - ⚠️ Route line + waypoints must be on top (high zorder) to not be hidden by background. - ⚠️ CARTOPY IS MANDATORY: Always add `ax.coastlines()` — without land boundaries it is impossible to see where the route passes relative to coastlines (e.g., Suez Canal, Malacca Strait). - ⚠️ Always pass `transform=ccrs.PlateCarree()` to route scatter/line plotting calls when using Cartopy. """, } # ============================================================================= # ARGUMENT SCHEMA # ============================================================================= class AnalysisGuideArgs(BaseModel): """Arguments for analysis guide retrieval.""" topic: Literal[ # Data operations "load_data", "spatial_subset", "temporal_subset", # Statistical analysis "anomalies", "zscore", "trend_analysis", "eof_analysis", # Advanced analysis "correlation_analysis", "composite_analysis", "diurnal_cycle", "seasonal_decomposition", "spectral_analysis", "spatial_statistics", "multi_variable", "climatology_normals", # Climate indices & extremes "climate_indices", "extremes", "drought_analysis", "heatwave_detection", "atmospheric_rivers", "blocking_events", # Domain-specific "energy_budget", "wind_energy", "moisture_budget", "convective_potential", "snow_cover", # Visualization "visualization_spatial", "visualization_timeseries", "visualization_anomaly_map", "visualization_wind", "visualization_comparison", "visualization_profile", "visualization_distribution", "visualization_animation", "visualization_dashboard", "visualization_contour", "visualization_correlation_map", # Maritime "maritime_route", "maritime_visualization", ] = Field( description="Analysis topic to get guidance for" ) # ============================================================================= # TOOL FUNCTION # ============================================================================= def get_analysis_guide(topic: str) -> str: """ Get methodological guidance for climate data analysis. Returns text instructions for using python_repl to perform the analysis. """ guide = ANALYSIS_GUIDES.get(topic) if not guide: available = ", ".join(sorted(ANALYSIS_GUIDES.keys())) return f"Unknown topic: {topic}. Available: {available}" return f""" # Analysis Guide: {topic.replace('_', ' ').title()} {guide} --- Use python_repl to implement this analysis with your downloaded ERA5 data. """ # ============================================================================= # TOOL DEFINITIONS # ============================================================================= analysis_guide_tool = StructuredTool.from_function( func=get_analysis_guide, name="get_analysis_guide", description=""" Get methodological guidance for climate data analysis. Returns workflow steps, quality checklists, and pitfall warnings for: - Data: load_data, spatial_subset, temporal_subset - Statistics: anomalies, zscore, trend_analysis, eof_analysis - Advanced: correlation_analysis, composite_analysis, diurnal_cycle, seasonal_decomposition, spectral_analysis, spatial_statistics, multi_variable, climatology_normals - Climate: climate_indices, extremes, drought_analysis, heatwave_detection, atmospheric_rivers, blocking_events - Domain: energy_budget, wind_energy, moisture_budget, convective_potential, snow_cover - Visualization: visualization_spatial, visualization_timeseries, visualization_anomaly_map, visualization_wind, visualization_comparison, visualization_profile, visualization_distribution, visualization_animation, visualization_dashboard, visualization_contour, visualization_correlation_map - Maritime: maritime_route, maritime_visualization Use this BEFORE writing analysis code in python_repl. """, args_schema=AnalysisGuideArgs, ) # Visualization guide - alias for backward compatibility visualization_guide_tool = StructuredTool.from_function( func=get_analysis_guide, name="get_visualization_guide", description=""" Get publication-grade visualization instructions for ERA5 climate data. CALL THIS BEFORE creating any plot to get: - Correct colormap choices - Standard value ranges - Required map elements - Best practices Available visualization topics: - visualization_spatial: Maps with proper projections - visualization_timeseries: Time series plots - visualization_anomaly_map: Diverging anomaly maps - visualization_wind: Quiver/streamline plots - visualization_comparison: Multi-panel comparisons - visualization_profile: Hovmöller diagrams - visualization_distribution: Histograms/PDFs - visualization_animation: Sequential map grids - visualization_dashboard: Multi-panel summaries - visualization_contour: Isobar/isotherm plots - visualization_correlation_map: Spatial correlation maps - maritime_visualization: Route risk maps """, args_schema=AnalysisGuideArgs, )