# SPECS.md: Panoply-Lite (Python/Gradio)

Goal: A browser-based viewer for **netCDF/HDF/GRIB/Zarr** datasets with an **xarray+Dask** backend, **Cartopy** maps, multi-dimensional slicing, animations, custom color tables (CPT/ACT/RGB), map projections, **OPeNDAP/THREDDS/S3/Zarr** support, and exports (PNG/SVG/PDF/MP4). Think “Panoply in the browser.”

---

## 0) TL;DR Build Contract

- **Deliverable**: `app.py` + `panlite/` package + tests + pinned `pyproject.toml` (or `requirements.txt`) + minimal sample assets.
- **Runtime**: Python 3.11; CPU-only acceptable; FFmpeg required for MP4.
- **UI**: Gradio Blocks (single page).
- **Perf**: Lazy I/O with Dask; responsive for slices ≤ ~5e6 elements.
- **Outcomes (Definition of Done)**: Load ERA5/CMIP-like datasets; produce a global map, a time–latitude section, a 1D line plot, an animation (MP4), and an A–B difference contour; export PNG/SVG/PDF; save/load view state JSON.

---

## 1) Scope

### 1.1 MVP Features

- **Open sources**: local files; `https://` (incl. **OPeNDAP/THREDDS**); `s3://` (anonymous or creds); `zarr://` (local/remote).
- **Formats/engines**:
  - netCDF/HDF5 → `xarray.open_dataset(..., engine="h5netcdf" | "netcdf4")`
  - GRIB → `cfgrib` (requires ecCodes)
  - Zarr → `xarray.open_zarr(...)`
- **Discovery**: list variables, dims, shapes, dtypes, attrs; CF axis inference (lat, lon, time, level).
- **Slicing**: choose X/Y axes; set index/value selectors for remaining dims (nearest when numeric); time selectors.
- **Plots**:
  - 1D: line (any single dim)
  - 2D: image/contour (any two dims)
  - Map: lon–lat georeferenced 2D with Cartopy (projections, coastlines, gridlines)
  - Sections: time–lat, time–lev, lon–lev (pick dims)
- **Combine**: sum/avg/diff of two variables on a common grid (with reindex/broadcast; error out if CRS differs).
- **Colors**: matplotlib colormaps + load **CPT/ACT/RGB** tables.
- **Overlays**: coastlines, borders; optional land/ocean masks.
- **Export**: PNG/SVG/PDF; **MP4** (or frames) for animations.
- **State**: save/load a JSON “view state.”

### 1.2 v1.1 Stretch (optional)

- **KMZ** export (ground overlay tiles).
- **GeoTIFF** export for georeferenced 2D arrays.
- **Trajectory** plots (CF trajectories).
- **Zonal mean** helper (lat/lon aggregation).
- **xESMF** regridding for combine.

### 1.3 Non-Goals (v1)

3D volume rendering; advanced reprojection pipelines; interactive WebGL.

---

## 2) Architecture & Layout

    panoply-lite/
      app.py
      panlite/
        __init__.py
        io.py        # open/close; engine select; cache; variable listing
        grid.py      # alignment, simple reindex/broadcast; combine ops; sections
        plot.py      # 1D/2D/map plotting; exports
        anim.py      # animate over a dim -> MP4/frames
        colors.py    # CPT/ACT/RGB loaders -> matplotlib Colormap
        state.py     # view-state (serialize/deserialize; schema validate)
        utils.py     # CF axis guess; CRS helpers; small helpers
      assets/
        colormaps/   # sample .cpt/.act/.rgb
      tests/
        test_io.py
        test_plot.py
        test_anim.py
        data/        # tiny sample datasets (≤5 MB)
      pyproject.toml (or requirements.txt)
      README.md
      SPECS.md

---

## 3) Dependencies (pin reasonably)

- Core: xarray, dask[complete], numpy, pandas, fsspec, s3fs, zarr, h5netcdf, cfgrib, eccodes
- Geo: cartopy, pyproj, rioxarray
- Viz: matplotlib
- Web UI: gradio>=4
- Misc: pydantic, orjson, pytest, ruff
- System: ffmpeg in PATH for MP4

requirements.txt example:

    xarray
    dask[complete]
    fsspec
    s3fs
    zarr
    h5netcdf
    cfgrib
    eccodes
    rioxarray
    cartopy
    pyproj
    matplotlib
    gradio>=4
    numpy
    pandas
    pydantic
    orjson
    pytest

---

## 4) API Surface

### io.py

- `open_any(uri, engine=None, chunks="auto") -> DatasetHandle`
- `list_variables(handle) -> list[VariableSpec]`
- `get_dataarray(handle, var) -> xr.DataArray`
- `close(handle)`

### grid.py

- `align_for_combine(a, b, method="reindex")`
- `combine(a, b, op="sum"|"avg"|"diff")`
- `section(da, along: str, fixed: dict)`

### plot.py

- `plot_1d(da, **style) -> Figure`
- `plot_2d(da, kind="image"|"contour", **style) -> Figure`
- `plot_map(da, proj, **style) -> Figure`
- `export_fig(fig, fmt="png"|"svg"|"pdf", dpi=150, out_path=None)`

### anim.py

- `animate_over_dim(da, dim: str, fps=10, out="anim.mp4") -> str`

### colors.py

- `load_cpt(path) -> Colormap`
- `load_act(path) -> Colormap`
- `load_rgb(path) -> Colormap`

### state.py

- `dump_state(dict) -> str` (JSON)
- `load_state(str) -> dict`

---

## 5) UI Spec (Gradio Blocks)

Sidebar:

- File open (local, URL, S3)
- Dataset vars: pick A, optional B; operation (sum/avg/diff)
- Axes: X, Y; slice others (sliders, dropdowns)
- Plot type: 1D, 2D image, 2D contour, Map
- Projection: PlateCarree, Robinson, etc.
- Colormap: dropdown + upload CPT/ACT/RGB
- Colorbar options; vmin/vmax; levels
- Animation: dim=[time|lev], FPS, MP4 export
- Export: PNG/SVG/PDF
- Save/load view state JSON

Main Panel:

- Figure canvas (matplotlib)
- Metadata panel: units, attrs, dims

---

## 6) Acceptance Criteria

- Open an ERA5 dataset (Zarr or OPeNDAP).
- Plot: (a) global 2D map with Robinson projection; (b) time–lat section; (c) 1D time series.
- Produce an A–B difference contour plot.
- Load a custom CPT colormap and export the figure as SVG/PDF.
- Animate over the time dimension (24 frames) and export MP4.
- Save the view state; reloading it reproduces an identical figure.

---

## 7) Testing Checklist

- Local netCDF, remote OPeNDAP, public S3, and Zarr sources open successfully.
- Variable discovery works; CF axis inference is correct.
- 1D/2D/Map plotting is functional.
- Combine (sum/avg/diff) is correct on an aligned grid.
- Custom CPT colormap is applied correctly.
- PNG/SVG/PDF exports have the requested dimensions and DPI.
- Animation over time produces the correct frame count and a valid MP4.
- Large datasets stay responsive because Dask loads lazily (target: slices ≤ ~5e6 elements).
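
The "engine select" responsibility of `io.py` (sections 1.1 and 4) could be sketched roughly as follows. This is only an illustration: the helper name `guess_opener`, the extension sets, and the rule that every `http(s)` URL is treated as OPeNDAP are assumptions, not part of the spec. It returns the name of the xarray function to call and the engine to pass, without importing xarray itself.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple
from urllib.parse import urlparse

# Hypothetical extension tables; the spec names the engines but not these rules.
_NETCDF_EXTS = {".nc", ".nc4", ".h5", ".hdf5"}
_GRIB_EXTS = {".grib", ".grib2", ".grb", ".grb2"}


def guess_opener(uri: str) -> Tuple[str, Optional[str]]:
    """Return (xarray function name, engine) for a local path or URI."""
    parsed = urlparse(uri)
    scheme = parsed.scheme.lower()
    # Trailing slashes are common on Zarr stores, so strip them before
    # looking at the suffix.
    suffix = PurePosixPath(parsed.path.rstrip("/")).suffix.lower()

    if scheme == "zarr" or suffix == ".zarr":
        return ("open_zarr", None)
    if suffix in _GRIB_EXTS:
        return ("open_dataset", "cfgrib")
    if scheme in {"http", "https"}:
        # Assumption: remote HTTP(S) sources are OPeNDAP endpoints, which the
        # netcdf4 engine can read and h5netcdf cannot.
        return ("open_dataset", "netcdf4")
    if suffix in _NETCDF_EXTS:
        return ("open_dataset", "h5netcdf")
    raise ValueError(f"cannot infer an engine for {uri!r}; pass engine= explicitly")
```

An `open_any` implementation might call this only when `engine=None`, so that an explicit engine from the caller always wins.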
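
For the `state.py` functions in section 4, a minimal round-trip sketch is shown below. The field names in `REQUIRED` and `OPTIONAL` are hypothetical (the spec does not define the view-state schema), and the standard-library `json` module stands in for the pydantic/orjson stack listed under dependencies.

```python
import json
from typing import Any, Dict

# Hypothetical schema: field names and types are illustrative only.
REQUIRED = {"uri": str, "var_a": str, "plot_type": str}
OPTIONAL = {
    "var_b": str,
    "op": str,
    "x": str,
    "y": str,
    "selectors": dict,
    "projection": str,
    "cmap": str,
    "vmin": (int, float),
    "vmax": (int, float),
}
ALLOWED_OPS = {"sum", "avg", "diff"}  # combine operations named in the spec


def validate_state(state: Dict[str, Any]) -> Dict[str, Any]:
    """Raise ValueError if the state has missing, unknown, or mistyped keys."""
    for key, typ in REQUIRED.items():
        if key not in state:
            raise ValueError(f"missing required key: {key}")
        if not isinstance(state[key], typ):
            raise ValueError(f"wrong type for {key}")
    for key, value in state.items():
        if key in REQUIRED:
            continue
        if key not in OPTIONAL:
            raise ValueError(f"unknown key: {key}")
        if value is not None and not isinstance(value, OPTIONAL[key]):
            raise ValueError(f"wrong type for {key}")
    if state.get("op") is not None and state["op"] not in ALLOWED_OPS:
        raise ValueError(f"unsupported op: {state['op']}")
    return state


def dump_state(state: Dict[str, Any]) -> str:
    """Serialize a validated view state to deterministic JSON."""
    return json.dumps(validate_state(state), sort_keys=True, indent=2)


def load_state(text: str) -> Dict[str, Any]:
    """Parse JSON text and validate it as a view state."""
    state = json.loads(text)
    if not isinstance(state, dict):
        raise ValueError("view state must be a JSON object")
    return validate_state(state)
```

Sorting keys makes the saved file stable across runs, which helps when checking the acceptance criterion that a reloaded state reproduces the same figure.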
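
As an illustration of what `load_act` in `colors.py` might do internally, the sketch below parses the bytes of an Adobe Color Table into normalized RGB tuples. It assumes the commonly documented layout: 256 RGB triplets (768 bytes), optionally followed by a 2-byte big-endian color count and a 2-byte transparency index (772 bytes). The name `parse_act` is hypothetical; the resulting list could then be handed to `matplotlib.colors.ListedColormap` to obtain a `Colormap`.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]


def parse_act(data: bytes) -> List[RGB]:
    """Parse Adobe Color Table bytes into RGB tuples scaled to [0, 1]."""
    if len(data) not in (768, 772):
        raise ValueError(f"unexpected ACT size: {len(data)} bytes")

    count = 256
    if len(data) == 772:
        # Optional trailer: number of used colors, then a transparency index
        # (the transparency index is ignored in this sketch).
        declared = int.from_bytes(data[768:770], "big")
        if 0 < declared <= 256:
            count = declared

    return [
        (data[i] / 255, data[i + 1] / 255, data[i + 2] / 255)
        for i in range(0, 3 * count, 3)
    ]
```

A file-based wrapper would simply read the path in binary mode and pass the bytes to this function.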
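
The CPT acceptance criterion (section 6) depends on parsing GMT color palette tables. The following sketch handles only the simplest variant: RGB mode, whitespace-separated numeric values, one `z0 r g b z1 r g b` segment per line, plus the `B`/`F`/`N` background, foreground, and NaN lines. HSV tables, named colors, and `r/g/b` slash notation are deliberately not covered, and `parse_cpt` is a hypothetical helper name rather than something the spec defines.

```python
from typing import Dict, List, Tuple

RGB = Tuple[float, float, float]
Segment = Tuple[float, RGB, float, RGB]


def _rgb(parts: List[str]) -> RGB:
    """Convert three 0-255 strings to floats in [0, 1]."""
    r, g, b = (float(p) / 255 for p in parts)
    return (r, g, b)


def parse_cpt(text: str) -> Tuple[List[Segment], Dict[str, RGB]]:
    """Parse a simple RGB-mode CPT into segments and B/F/N special colors."""
    segments: List[Segment] = []
    special: Dict[str, RGB] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split()
        if parts[0] in ("B", "F", "N"):
            if len(parts) >= 4:
                special[parts[0]] = _rgb(parts[1:4])
            continue
        if len(parts) < 8:
            raise ValueError(f"unsupported CPT line: {line!r}")
        z0, z1 = float(parts[0]), float(parts[4])
        segments.append((z0, _rgb(parts[1:4]), z1, _rgb(parts[5:8])))
    if not segments:
        raise ValueError("no color segments found")
    return segments, special
```

Each segment carries its own start and end colors, so a loader can build either a stepped or a continuous colormap from the same parse result.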