# SPECS.md — Panoply-Lite (Python/Gradio)
Goal: A browser-based viewer for **netCDF/HDF/GRIB/Zarr** datasets with an **xarray+Dask** backend, **Cartopy** maps, multi-dimensional slicing, animations, custom color tables (CPT/ACT/RGB), map projections, **OPeNDAP/THREDDS/S3/Zarr** support, and exports (PNG/SVG/PDF/MP4). Think “Panoply in the browser.”

---
## 0) TL;DR Build Contract
- **Deliverable**: `app.py` + `panlite/` package + tests + pinned `pyproject.toml` (or `requirements.txt`) + minimal sample assets.
- **Runtime**: Python 3.11; CPU-only acceptable; FFmpeg required for MP4.
- **UI**: Gradio Blocks (single page).
- **Perf**: Lazy I/O with Dask; responsive for slices ≤ ~5e6 elements.
- **Outcomes (Definition of Done)**: Load ERA5/CMIP-like datasets; produce a global map, a time–latitude section, a 1D line plot, an animation (MP4), and an A–B difference contour; export PNG/SVG/PDF; save/load view state JSON.
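
The Perf budget above can be checked before any data is loaded, using only the shape of the requested slice. The following is a minimal sketch, assuming a hypothetical helper `stride_for_budget` that is not part of the API surface in this spec. It picks one integer stride for all plotted axes so the rendered slice stays within ~5e6 elements; with xarray the stride could then be applied through `da.isel({dim: slice(None, None, stride)})`.

```python
import math

MAX_ELEMENTS = 5_000_000  # budget taken from the Perf item in this spec


def slice_size(shape):
    """Number of elements in a slice with the given shape."""
    return math.prod(shape)


def stride_for_budget(shape, max_elements=MAX_ELEMENTS):
    """Smallest integer stride, applied to every axis, that keeps the
    decimated slice within the element budget (hypothetical helper)."""
    stride = 1
    while slice_size(math.ceil(n / stride) for n in shape) > max_elements:
        stride += 1
    return stride


# A 0.25-degree global grid fits as-is; a 0.05-degree grid needs decimation.
print(stride_for_budget((721, 1440)))
print(stride_for_budget((3600, 7200)))
```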
---
## 1) Scope
### 1.1 MVP Features
- **Open sources**: local files; `https://` (incl. **OPeNDAP/THREDDS**); `s3://` (anonymous or creds); `zarr://` (local/remote).
- **Formats/engines**:
  - netCDF/HDF5 → `xarray.open_dataset(..., engine="h5netcdf" | "netcdf4")`
  - GRIB → `cfgrib` (requires ecCodes)
  - Zarr → `xarray.open_zarr(...)`
- **Discovery**: list variables, dims, shapes, dtypes, attrs; CF axis inference (lat, lon, time, level).
- **Slicing**: choose X/Y axes; set index/value selectors for remaining dims (nearest when numeric); time selectors.
- **Plots**:
  - 1D: line (any single dim)
  - 2D: image/contour (any two dims)
  - Map: lon–lat georeferenced 2D with Cartopy (projections, coastlines, gridlines)
  - Sections: time–lat, time–lev, lon–lev (pick dims)
- **Combine**: sum/avg/diff of two variables on a common grid (with reindex/broadcast; error out if CRS differs).
- **Colors**: matplotlib colormaps + load **CPT/ACT/RGB** tables.
- **Overlays**: coastlines, borders; optional land/ocean masks.
- **Export**: PNG/SVG/PDF; **MP4** (or frames) for animations.
- **State**: save/load a JSON “view state.”
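
The CF axis inference named under Discovery can be sketched with plain attribute dictionaries, before any xarray object is involved. This is an illustrative sketch only: the function name `guess_axis` and the fallback name table are assumptions, and the checks cover the common CF hints (`axis`, `standard_name`, `units`, `positive`) rather than the full convention. It returns CF axis letters, where X/Y/Z/T correspond to the lon/lat/level/time roles used in this spec.

```python
# Unit spellings accepted by CF for latitude and longitude, lowercased.
LAT_UNITS = {"degrees_north", "degree_north", "degrees_n", "degree_n", "degreesn", "degreen"}
LON_UNITS = {"degrees_east", "degree_east", "degrees_e", "degree_e", "degreese", "degreee"}

# Last-resort guesses from the coordinate name (an assumption, not part of CF).
FALLBACK_NAMES = {
    "lon": "X", "longitude": "X",
    "lat": "Y", "latitude": "Y",
    "lev": "Z", "level": "Z", "plev": "Z", "height": "Z", "depth": "Z",
    "time": "T",
}


def guess_axis(name, attrs):
    """Return "X", "Y", "Z", "T", or None for one coordinate variable.

    `attrs` is a plain dict of that coordinate's attributes.
    """
    axis = str(attrs.get("axis", "")).upper()
    if axis in {"X", "Y", "Z", "T"}:
        return axis
    standard_name = str(attrs.get("standard_name", "")).lower()
    units = str(attrs.get("units", "")).lower()
    if standard_name == "longitude" or units in LON_UNITS:
        return "X"
    if standard_name == "latitude" or units in LAT_UNITS:
        return "Y"
    if standard_name == "time" or " since " in units:
        return "T"
    if str(attrs.get("positive", "")).lower() in {"up", "down"}:
        return "Z"
    return FALLBACK_NAMES.get(name.lower())
```

With xarray, the same function could be called as `guess_axis(name, ds[name].attrs)` for each coordinate in `ds.coords`.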
### 1.2 v1.1 Stretch (optional)
- **KMZ** export (ground overlay tiles).
- **GeoTIFF** export for georeferenced 2D arrays.
- **Trajectory** plots (CF trajectories).
- **Zonal mean** helper (lat/lon aggregation).
- **xESMF** regridding for combine.
### 1.3 Non-Goals (v1)
- 3D volume rendering
- Advanced reprojection pipelines
- Interactive WebGL
---
## 2) Architecture & Layout
    panoply-lite/
      app.py
      panlite/
        __init__.py
        io.py       # open/close; engine select; cache; variable listing
        grid.py     # alignment, simple reindex/broadcast; combine ops; sections
        plot.py     # 1D/2D/map plotting; exports
        anim.py     # animate over a dim -> MP4/frames
        colors.py   # CPT/ACT/RGB loaders -> matplotlib Colormap
        state.py    # view-state (serialize/deserialize; schema validate)
        utils.py    # CF axis guess; CRS helpers; small helpers
      assets/
        colormaps/  # sample .cpt/.act/.rgb
      tests/
        test_io.py
        test_plot.py
        test_anim.py
        data/       # tiny sample datasets (≤5 MB)
      pyproject.toml (or requirements.txt)
      README.md
      SPECS.md
---
## 3) Dependencies (pin reasonably)
- Core: xarray, dask[complete], numpy, pandas, fsspec, s3fs, zarr, h5netcdf, cfgrib, eccodes
- Geo: cartopy, pyproj, rioxarray
- Viz: matplotlib
- Web UI: gradio>=4
- Misc: pydantic, orjson, pytest, ruff
- System: ffmpeg in PATH for MP4

requirements.txt example:

    xarray
    dask[complete]
    fsspec
    s3fs
    zarr
    h5netcdf
    cfgrib
    eccodes
    rioxarray
    cartopy
    pyproj
    matplotlib
    gradio>=4
    numpy
    pandas
    pydantic
    orjson
    pytest
---
## 4) API Surface
### io.py
- `open_any(uri, engine=None, chunks="auto") -> DatasetHandle`
- `list_variables(handle) -> list[VariableSpec]`
- `get_dataarray(handle, var) -> xr.DataArray`
- `close(handle)`
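
When `open_any` is called with `engine=None`, it has to pick a backend from the URI alone. The following is one possible sketch of that decision, assuming a hypothetical helper `choose_engine`; the suffix lists and the rule that every `http(s)` URL is an OPeNDAP endpoint are simplifying assumptions, not requirements of this spec. The `"zarr"` result stands for the `xarray.open_zarr(...)` path listed in section 1.1.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

GRIB_SUFFIXES = {".grib", ".grb", ".grib2", ".grb2"}  # assumed suffix list


def choose_engine(uri):
    """Guess a backend name from a URI or local path (hypothetical helper)."""
    parsed = urlparse(uri)
    scheme = parsed.scheme
    suffix = PurePosixPath(parsed.path.rstrip("/")).suffix.lower()
    if scheme == "zarr" or suffix == ".zarr":
        return "zarr"
    if suffix in GRIB_SUFFIXES:
        return "cfgrib"
    if scheme in {"http", "https"}:
        # Assumption: remote HTTP(S) URLs are OPeNDAP endpoints, which the
        # netCDF-C based backend can read and h5netcdf cannot.
        return "netcdf4"
    # Default for local and s3:// netCDF4/HDF5 files.
    return "h5netcdf"
```

An explicit `engine=` argument should still take precedence over this guess, since suffixes are not reliable (for example, a netCDF classic file with a `.nc` suffix cannot be read by `h5netcdf`).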
### grid.py
- `align_for_combine(a, b, method="reindex")`
- `combine(a, b, op="sum"|"avg"|"diff")`
- `section(da, along: str, fixed: dict)`
### plot.py
- `plot_1d(da, **style) -> Figure`
- `plot_2d(da, kind="image"|"contour", **style) -> Figure`
- `plot_map(da, proj, **style) -> Figure`
- `export_fig(fig, fmt="png"|"svg"|"pdf", dpi=150, out_path=None)`
### anim.py
- `animate_over_dim(da, dim: str, fps=10, out="anim.mp4") -> str`
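
One way to implement the MP4 step is to render one PNG per index along `dim` and hand the numbered frames to FFmpeg. The sketch below only builds the command line and checks that FFmpeg is on `PATH`; the helper names and the frame naming pattern are assumptions. The scale filter rounds both dimensions down to even numbers, which `libx264` with `yuv420p` requires.

```python
import shutil


def ffmpeg_available():
    """True if an `ffmpeg` executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def ffmpeg_command(frame_pattern, out="anim.mp4", fps=10):
    """Build the argument list that encodes numbered PNG frames to H.264 MP4.

    `frame_pattern` is a printf-style pattern such as "frames/frame_%04d.png".
    """
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-framerate", str(fps),    # input frame rate
        "-i", frame_pattern,
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",  # force even dimensions
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",     # widest player compatibility
        out,
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`; when `ffmpeg_available()` is false, falling back to the "frames" export mentioned in section 1.1 avoids a hard failure.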
### colors.py
- `load_cpt(path) -> Colormap`
- `load_act(path) -> Colormap`
- `load_rgb(path) -> Colormap`
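
As an illustration of what `load_cpt` involves, the sketch below parses the most common CPT layout, where each segment line holds `z0 r g b z1 r g b` with 0–255 RGB values. It is a partial sketch under stated assumptions: `parse_cpt` and `to_colormap` are hypothetical names, HSV and named-color CPTs are not handled, and the `B`/`F`/`N` (background/foreground/NaN) lines are skipped rather than applied.

```python
def parse_cpt(text):
    """Parse 8-column RGB CPT text into (position, (r, g, b)) stops in [0, 1]."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#") or fields[0] in {"B", "F", "N"}:
            continue  # blank lines, comments, and B/F/N lines are ignored here
        if len(fields) < 8:
            raise ValueError(f"unsupported CPT line: {line!r}")
        z0, r0, g0, b0, z1, r1, g1, b1 = (float(v) for v in fields[:8])
        segments.append((z0, (r0, g0, b0), z1, (r1, g1, b1)))
    if not segments:
        raise ValueError("no color segments found")
    zmin, zmax = segments[0][0], segments[-1][2]
    if zmax <= zmin:
        raise ValueError("CPT z-range must be increasing")
    stops = []
    for z0, c0, z1, c1 in segments:
        for z, color in ((z0, c0), (z1, c1)):
            stop = ((z - zmin) / (zmax - zmin), tuple(v / 255 for v in color))
            if not stops or stops[-1] != stop:  # drop exact repeats at joins
                stops.append(stop)
    return stops


def to_colormap(stops, name="cpt"):
    """Turn parsed stops into a matplotlib colormap."""
    from matplotlib.colors import LinearSegmentedColormap  # imported lazily

    return LinearSegmentedColormap.from_list(name, stops)
```

Keeping the parser separate from the matplotlib call lets the parsing logic be tested on short strings without creating figures.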
### state.py
- `dump_state(dict) -> str` (JSON)
- `load_state(str) -> dict`
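
A minimal sketch of the view-state round trip follows. It uses the standard-library `json` module as a stand-in for the `pydantic`/`orjson` stack listed under Dependencies, and the field names (`uri`, `variable`, `plot_type`) and the `version` key are illustrative assumptions, since this spec does not define the schema.

```python
import json

STATE_VERSION = 1
REQUIRED_KEYS = {"uri", "variable", "plot_type"}  # hypothetical field names


def _check_keys(state):
    missing = REQUIRED_KEYS - state.keys()
    if missing:
        raise ValueError(f"view state is missing keys: {sorted(missing)}")


def dump_state(state):
    """Serialize a view-state dict to a JSON string with a schema version."""
    _check_keys(state)
    payload = {**state, "version": STATE_VERSION}
    # Sorted keys keep the output stable, so saved states diff cleanly.
    return json.dumps(payload, sort_keys=True, indent=2)


def load_state(text):
    """Parse and validate a JSON view state; return it without the version."""
    payload = json.loads(text)
    if not isinstance(payload, dict):
        raise ValueError("view state must be a JSON object")
    if payload.get("version") != STATE_VERSION:
        raise ValueError(f"unsupported view-state version: {payload.get('version')!r}")
    _check_keys(payload)
    payload.pop("version")
    return payload
```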
---
## 5) UI Spec (Gradio Blocks)
Sidebar:
- File open (local, URL, S3)
- Dataset vars: pick A, optional B; operation (sum/avg/diff)
- Axes: X, Y; slice others (sliders, dropdowns)
- Plot type: 1D, 2D image, 2D contour, Map
- Projection: PlateCarree, Robinson, etc.
- Colormap: dropdown + upload CPT/ACT/RGB
- Colorbar options; vmin/vmax; levels
- Animation: dim=[time|lev], FPS, MP4 export
- Export: PNG/SVG/PDF
- Save/load view state JSON

Main Panel:
- Figure canvas (matplotlib)
- Metadata panel: units, attrs, dims
---
## 6) Acceptance Criteria
- Open ERA5 (Zarr/OPeNDAP) dataset.
- Plot: (a) global 2D map with Robinson projection; (b) time–lat section; (c) 1D time series.
- Perform A–B difference contour plot.
- Load a custom CPT colormap, apply it to a plot, and export the figure as SVG/PDF.
- Animate over time dimension (24 frames), export MP4.
- Save the view state; reloading it reproduces an identical figure.
---
## 7) Testing Checklist
- Local netCDF, remote OPeNDAP, public S3, and Zarr sources open successfully.
- Variable discovery works; CF axis inference correct.
- 1D/2D/Map plotting functional.
- Combine (sum/avg/diff of A and B) gives correct values on an aligned grid.
- Custom CPT colormap applied correctly.
- Export PNG/SVG/PDF correct dimensions/DPI.
- Animation over time produces correct frame count, valid MP4.
- Large datasets stay responsive: data is loaded lazily through Dask, and slices ≤ ~5e6 elements render without loading the full dataset.
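
The "correct dimensions/DPI" check for PNG exports can be done without an image library by reading the width and height from the file's IHDR chunk. This is a sketch under assumptions: the helper names are hypothetical, and the expected size assumes the figure is saved without `bbox_inches="tight"`, which would change the pixel dimensions.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) in pixels from the IHDR chunk of PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def expected_pixels(figsize_inches, dpi):
    """Pixel size implied by a figure size in inches and a DPI value."""
    width_in, height_in = figsize_inches
    return round(width_in * dpi), round(height_in * dpi)
```

In a test, `png_size(Path(out_path).read_bytes())` can be compared against `expected_pixels(fig.get_size_inches(), dpi)`; allowing a one-pixel tolerance is a reasonable precaution because the renderer's rounding is not specified here.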