# SPECS.md — Panoply-Lite (Python/Gradio)
Goal: A browser-based viewer for **netCDF/HDF/GRIB/Zarr** datasets with an **xarray+Dask** backend, **Cartopy** maps, multi-dimensional slicing, animations, custom color tables (CPT/ACT/RGB), map projections, **OPeNDAP/THREDDS/S3/Zarr** support, and exports (PNG/SVG/PDF/MP4). Think “Panoply in the browser.”
---
## 0) TL;DR Build Contract
- **Deliverable**: `app.py` + `panlite/` package + tests + pinned `pyproject.toml` (or `requirements.txt`) + minimal sample assets.
- **Runtime**: Python 3.11; CPU-only acceptable; FFmpeg required for MP4.
- **UI**: Gradio Blocks (single page).
- **Perf**: Lazy I/O with Dask; responsive for slices ≤ ~5e6 elements.
- **Outcomes (Definition of Done)**: Load ERA5/CMIP-like datasets; produce a global map, a time–latitude section, a 1D line plot, an animation (MP4), and an A–B difference contour; export PNG/SVG/PDF; save/load view state JSON.
---
## 1) Scope
### 1.1 MVP Features
- **Open sources**: local files; `https://` (incl. **OPeNDAP/THREDDS**); `s3://` (anonymous or creds); `zarr://` (local/remote).
- **Formats/engines**:
  - netCDF/HDF5 → `xarray.open_dataset(..., engine="h5netcdf" | "netcdf4")`
  - GRIB → `cfgrib` (requires ecCodes)
  - Zarr → `xarray.open_zarr(...)`
- **Discovery**: list variables, dims, shapes, dtypes, attrs; CF axis inference (lat, lon, time, level).
- **Slicing**: choose X/Y axes; set index/value selectors for remaining dims (nearest when numeric); time selectors.
- **Plots**:
  - 1D: line (any single dim)
  - 2D: image/contour (any two dims)
  - Map: lon–lat georeferenced 2D with Cartopy (projections, coastlines, gridlines)
  - Sections: time–lat, time–lev, lon–lev (pick dims)
- **Combine**: sum/avg/diff of two variables on a common grid (with reindex/broadcast; error out if CRS differs).
- **Colors**: matplotlib colormaps + load **CPT/ACT/RGB** tables.
- **Overlays**: coastlines, borders; optional land/ocean masks.
- **Export**: PNG/SVG/PDF; **MP4** (or frames) for animations.
- **State**: save/load a JSON “view state.”
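
The CF axis inference listed under **Discovery** could be sketched as below. This is a minimal illustration, not the project's implementation: the function name `guess_axis`, the lookup tables, and the order of precedence (explicit `axis` attribute, then `standard_name`, then `units`, then the variable name) are all assumptions.

```python
from typing import Mapping, Optional

# Hypothetical lookup tables; real datasets may need more entries.
_STANDARD_NAMES = {
    "longitude": "X",
    "latitude": "Y",
    "time": "T",
    "air_pressure": "Z",
    "altitude": "Z",
    "height": "Z",
    "depth": "Z",
}
_LON_UNITS = {"degrees_east", "degree_east", "degrees_e", "degree_e"}
_LAT_UNITS = {"degrees_north", "degree_north", "degrees_n", "degree_n"}
_NAME_HINTS = {
    "X": {"lon", "longitude", "x"},
    "Y": {"lat", "latitude", "y"},
    "Z": {"lev", "level", "plev", "height", "depth"},
    "T": {"time", "valid_time"},
}


def guess_axis(name: str, attrs: Mapping[str, object]) -> Optional[str]:
    """Return "X" (lon), "Y" (lat), "Z" (level), "T" (time), or None."""
    # 1. An explicit CF `axis` attribute takes precedence.
    axis = str(attrs.get("axis", "")).upper()
    if axis in {"X", "Y", "Z", "T"}:
        return axis
    # 2. Then the CF `standard_name` attribute.
    std = str(attrs.get("standard_name", "")).lower()
    if std in _STANDARD_NAMES:
        return _STANDARD_NAMES[std]
    # 3. Then the units: degrees east/north, or "<unit> since <date>" for time.
    units = str(attrs.get("units", "")).lower()
    if units in _LON_UNITS:
        return "X"
    if units in _LAT_UNITS:
        return "Y"
    if " since " in units:
        return "T"
    # 4. Finally fall back to common variable names.
    lowered = name.lower()
    for axis_code, names in _NAME_HINTS.items():
        if lowered in names:
            return axis_code
    return None
```

With xarray, a caller would typically pass `name` and `ds[name].attrs` for each coordinate and keep the first match per axis.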
### 1.2 v1.1 Stretch (optional)
- **KMZ** export (ground overlay tiles).
- **GeoTIFF** export for georeferenced 2D arrays.
- **Trajectory** plots (CF trajectories).
- **Zonal mean** helper (lat/lon aggregation).
- **xESMF** regridding for combine.
### 1.3 Non-Goals (v1)
3D volume rendering; advanced reprojection pipelines; interactive WebGL.
---
## 2) Architecture & Layout
    panoply-lite/
      app.py
      panlite/
        __init__.py
        io.py        # open/close; engine select; cache; variable listing
        grid.py      # alignment, simple reindex/broadcast; combine ops; sections
        plot.py      # 1D/2D/map plotting; exports
        anim.py      # animate over a dim -> MP4/frames
        colors.py    # CPT/ACT/RGB loaders -> matplotlib Colormap
        state.py     # view-state (serialize/deserialize; schema validate)
        utils.py     # CF axis guess; CRS helpers; small helpers
      assets/
        colormaps/   # sample .cpt/.act/.rgb
      tests/
        test_io.py
        test_plot.py
        test_anim.py
        data/        # tiny sample datasets (≤5 MB)
      pyproject.toml (or requirements.txt)
      README.md
      SPECS.md
---
## 3) Dependencies (pin reasonably)
- Core: xarray, dask[complete], numpy, pandas, fsspec, s3fs, zarr, h5netcdf, cfgrib, eccodes
- Geo: cartopy, pyproj, rioxarray
- Viz: matplotlib
- Web UI: gradio>=4
- Misc: pydantic, orjson, pytest, ruff
- System: ffmpeg in PATH for MP4

`requirements.txt` example:

    xarray
    dask[complete]
    fsspec
    s3fs
    zarr
    h5netcdf
    cfgrib
    eccodes
    rioxarray
    cartopy
    pyproj
    matplotlib
    gradio>=4
    numpy
    pandas
    pydantic
    orjson
    pytest

---
## 4) API Surface
### io.py
- `open_any(uri, engine=None, chunks="auto") -> DatasetHandle`
- `list_variables(handle) -> list[VariableSpec]`
- `get_dataarray(handle, var) -> xr.DataArray`
- `close(handle)`
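
One possible shape for the engine selection behind `open_any` is sketched below. The helper `guess_engine`, the extension table, the `/dodsC/` check for THREDDS OPeNDAP URLs, and the handling of the `zarr://` prefix as an app-level marker are all assumptions for illustration. The sketch also returns a plain `xarray.Dataset` rather than the `DatasetHandle` named above, and leaves S3 credentials out.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical extension -> xarray engine table.
_EXTENSION_ENGINES = {
    ".nc": "h5netcdf", ".nc4": "h5netcdf", ".h5": "h5netcdf", ".hdf5": "h5netcdf",
    ".grib": "cfgrib", ".grb": "cfgrib", ".grib2": "cfgrib", ".grb2": "cfgrib",
    ".zarr": "zarr",
}


def guess_engine(uri: str) -> Optional[str]:
    """Pick an engine name from the URI, or None to let xarray decide."""
    parsed = urlparse(uri)
    if parsed.scheme == "zarr":
        return "zarr"
    # THREDDS serves OPeNDAP under /dodsC/; the netcdf4 engine can read DAP URLs.
    if parsed.scheme in {"http", "https"} and "/dodsC/" in parsed.path:
        return "netcdf4"
    path = parsed.path if parsed.scheme else uri
    suffix = PurePosixPath(path.rstrip("/")).suffix.lower()
    return _EXTENSION_ENGINES.get(suffix)


def open_any(uri: str, engine: Optional[str] = None, chunks="auto"):
    """Open a dataset lazily; returns an xarray.Dataset in this sketch."""
    import xarray as xr  # imported here so guess_engine has no heavy dependency

    engine = engine or guess_engine(uri)
    if engine == "zarr":
        # Assumption: "zarr://" is an app-level prefix and is stripped here.
        store = uri[len("zarr://"):] if uri.startswith("zarr://") else uri
        return xr.open_zarr(store, chunks=chunks)
    return xr.open_dataset(uri, engine=engine, chunks=chunks)
```

Keeping the guess in a pure function makes it easy to unit-test without any sample data.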
### grid.py
- `align_for_combine(a, b, method="reindex")`
- `combine(a, b, op="sum"|"avg"|"diff")`
- `section(da, along: str, fixed: dict)`
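
As a rough sketch of how `combine` and `align_for_combine` might fit together: the operator table and the choice of nearest-neighbour reindexing are assumptions, not something this spec prescribes. `combine` works on anything that supports arithmetic, including `xarray.DataArray` objects that have already been aligned.

```python
import operator
from typing import Any, Callable, Dict

# Hypothetical dispatch table for the three operations named above.
_OPS: Dict[str, Callable[[Any, Any], Any]] = {
    "sum": operator.add,
    "diff": operator.sub,
    "avg": lambda a, b: (a + b) / 2,
}


def combine(a, b, op: str):
    """Apply sum/avg/diff element-wise to two already-aligned operands."""
    try:
        func = _OPS[op]
    except KeyError:
        raise ValueError(f"unsupported op {op!r}; expected one of {sorted(_OPS)}") from None
    return func(a, b)


def align_for_combine(a, b, method: str = "reindex"):
    """Put b on a's grid (assumed: nearest-neighbour), or intersect the coords."""
    import xarray as xr  # only needed when real DataArrays are passed in

    if method == "reindex":
        return a, b.reindex_like(a, method="nearest")
    return xr.align(a, b, join="inner")
```

A typical call would be `combine(*align_for_combine(a, b), op="diff")` for the A–B difference plot.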
### plot.py
- `plot_1d(da, **style) -> Figure`
- `plot_2d(da, kind="image"|"contour", **style) -> Figure`
- `plot_map(da, proj, **style) -> Figure`
- `export_fig(fig, fmt="png"|"svg"|"pdf", dpi=150, out_path=None)`
### anim.py
- `animate_over_dim(da, dim: str, fps=10, out="anim.mp4") -> str`
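
One way `animate_over_dim` could produce the MP4 is to render one PNG per step and hand the frames to FFmpeg. The sketch below covers only the encoding step. The helper names, the `frame_%04d.png` naming pattern, and the choice of H.264 with `yuv420p` are assumptions. The padding filter is included because `libx264` with `yuv420p` needs even frame dimensions.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(frame_dir: str, out: str = "anim.mp4", fps: int = 10,
                         pattern: str = "frame_%04d.png") -> List[str]:
    """Build the argv list for encoding numbered PNG frames into an MP4."""
    return [
        "ffmpeg",
        "-y",                                   # overwrite an existing output file
        "-framerate", str(fps),                 # input frame rate
        "-i", str(Path(frame_dir) / pattern),   # numbered input frames
        "-vf", "pad=ceil(iw/2)*2:ceil(ih/2)*2", # pad to even width/height
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",                  # widely playable pixel format
        out,
    ]


def encode_frames(frame_dir: str, out: str = "anim.mp4", fps: int = 10) -> str:
    """Run FFmpeg on the frames; raises if FFmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH; MP4 export is unavailable")
    subprocess.run(build_ffmpeg_command(frame_dir, out, fps), check=True)
    return out
```

Checking for FFmpeg up front lets the UI fall back to frame export instead of failing mid-render.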
### colors.py
- `load_cpt(path) -> Colormap`
- `load_act(path) -> Colormap`
- `load_rgb(path) -> Colormap`
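
A minimal sketch of `load_cpt`, assuming the common GMT layout where each data line reads `z0 r g b z1 r g b` with 0–255 RGB values. HSV-model files, named colors, and slash-separated triplets are not handled here, and the `B`/`F`/`N` (background/foreground/NaN) lines are skipped. The `parse_cpt` helper is hypothetical.

```python
from pathlib import Path
from typing import List, Tuple

Stop = Tuple[float, Tuple[float, float, float]]


def parse_cpt(text: str) -> List[Stop]:
    """Turn CPT text into (position in [0, 1], (r, g, b) in [0, 1]) stops."""
    stops: List[Stop] = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and B/F/N lines.
        if not line or line.startswith("#") or line[0] in "BFN":
            continue
        fields = line.split()
        if len(fields) < 8:
            continue  # not in the z0 r g b z1 r g b form
        z0, r0, g0, b0, z1, r1, g1, b1 = (float(f) for f in fields[:8])
        stops.append((z0, (r0 / 255, g0 / 255, b0 / 255)))
        stops.append((z1, (r1 / 255, g1 / 255, b1 / 255)))
    if not stops:
        raise ValueError("no color segments found in CPT text")
    zmin, zmax = stops[0][0], stops[-1][0]
    if zmax <= zmin:
        raise ValueError("CPT z-values must increase")
    # Normalize z-values to the 0..1 range matplotlib expects.
    return [((z - zmin) / (zmax - zmin), rgb) for z, rgb in stops]


def load_cpt(path: str):
    """Build a matplotlib colormap from a CPT file."""
    from matplotlib.colors import LinearSegmentedColormap  # lazy import

    stops = parse_cpt(Path(path).read_text(encoding="utf-8"))
    return LinearSegmentedColormap.from_list(Path(path).stem, stops)
```

Because normalization discards the original z-range, a caller that wants the CPT's data limits would need to keep `zmin`/`zmax` and pass them as `vmin`/`vmax` when plotting.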
### state.py
- `dump_state(state: dict) -> str` (JSON string)
- `load_state(text: str) -> dict`
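
The view-state round trip could look roughly like this. The sketch uses the standard-library `json` module for brevity, although the dependency list suggests pydantic/orjson. The `schema_version` field and the set of required keys are hypothetical.

```python
import json
from typing import Any, Dict

SCHEMA_VERSION = 1
# Hypothetical minimum set of keys a view state must carry.
REQUIRED_KEYS = {"uri", "variable", "plot_type"}


def _check_keys(state: Dict[str, Any]) -> None:
    missing = REQUIRED_KEYS - state.keys()
    if missing:
        raise ValueError(f"view state is missing keys: {sorted(missing)}")


def dump_state(state: Dict[str, Any]) -> str:
    """Serialize a view state to deterministic, human-readable JSON."""
    _check_keys(state)
    payload = {"schema_version": SCHEMA_VERSION, **state}
    return json.dumps(payload, sort_keys=True, indent=2)


def load_state(text: str) -> Dict[str, Any]:
    """Parse and validate a view state produced by dump_state."""
    payload = json.loads(text)
    if not isinstance(payload, dict):
        raise ValueError("view state must be a JSON object")
    version = payload.pop("schema_version", None)
    if version != SCHEMA_VERSION:
        raise ValueError(f"unsupported schema_version: {version!r}")
    _check_keys(payload)
    return payload
```

Sorting keys keeps saved files stable across runs, which helps when checking that a reloaded state reproduces the same figure.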
---
## 5) UI Spec (Gradio Blocks)
Sidebar:
- File open (local, URL, S3)
- Dataset vars: pick A, optional B; operation (sum/avg/diff)
- Axes: X, Y; slice others (sliders, dropdowns)
- Plot type: 1D, 2D image, 2D contour, Map
- Projection: PlateCarree, Robinson, etc.
- Colormap: dropdown + upload CPT/ACT/RGB
- Colorbar options; vmin/vmax; levels
- Animation: dim=[time|lev], FPS, MP4 export
- Export: PNG/SVG/PDF
- Save/load view state JSON
Main Panel:
- Figure canvas (matplotlib)
- Metadata panel: units, attrs, dims
---
## 6) Acceptance Criteria
- Open ERA5 (Zarr/OPeNDAP) dataset.
- Plot: (a) global 2D map with Robinson projection; (b) time–lat section; (c) 1D time series.
- Perform A–B difference contour plot.
- Load a custom CPT colormap, apply it to a plot, and export the figure as SVG/PDF.
- Animate over time dimension (24 frames), export MP4.
- Save the view state; reloading it reproduces an identical figure.
---
## 7) Testing Checklist
- Local netCDF, remote OPeNDAP, public S3, Zarr open successfully.
- Variable discovery works; CF axis inference correct.
- 1D/2D/Map plotting functional.
- Combine (sum/avg/diff of A and B) gives correct results on an aligned grid.
- Custom CPT colormap applied correctly.
- Export PNG/SVG/PDF correct dimensions/DPI.
- Animation over time produces correct frame count, valid MP4.
- Large datasets stay responsive thanks to lazy Dask I/O (rendered slices ≤ ~5e6 elements).
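
The responsiveness item can be checked before any data is loaded, since the size of a slice follows from its shape alone. The sketch below is one possible guard. The helper names and the uniform-stride decimation strategy are assumptions; only the ~5e6 element budget comes from this spec.

```python
import math
from typing import Mapping

MAX_ELEMENTS = 5_000_000  # the ~5e6 element budget from the build contract


def slice_size(shape: Mapping[str, int]) -> int:
    """Number of elements in a slice, given its dim -> length mapping."""
    return math.prod(shape.values())


def suggest_stride(shape: Mapping[str, int], budget: int = MAX_ELEMENTS) -> int:
    """Smallest integer stride, applied to every dim, that fits the budget."""
    stride = 1
    while math.prod(math.ceil(n / stride) for n in shape.values()) > budget:
        stride += 1
    return stride
```

With xarray, the suggested stride could be applied lazily via `da.isel({dim: slice(None, None, stride) for dim in da.dims})` before plotting.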