# Landing Site Safety Analyzer – Architecture and Calculations

This document describes the flow in the current Gradio app (`app/ui.py`), from input selection through model inference, safety scoring, and UI composition.

## Data and Models
- **Inputs**: Images under `data/Image/` (VISLOC and any custom folders) via `list_all_data_inputs`, with a 5% border crop (`crop_nonblack`) to drop black padding. Supported extensions: jpg/jpeg/png (any case).
- **Depth model**: Depth Anything 3, cached per model id (`DepthEngine`). Inference caps the long side to `process_res_cap` (default 1024) using `upper_bound_resize` before predicting.
- **Segmentation model**: SAM3 (`facebook/sam3`) for promptable water/road/tree/roof masking. The segmenter is cached per model id but masks are recomputed every run (no output cache). Default `segmentation_max_side` is 512 and is clamped to the depth resolution (min 128).
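
The two resolution limits above can be illustrated with a minimal sketch. The helper names `cap_long_side` and `clamp_segmentation_side` are hypothetical and are not the app's `upper_bound_resize`; the sketch assumes that the cap only ever downsizes and that the segmentation side is held between 128 px and the depth long side.

```python
def cap_long_side(width: int, height: int, cap: int = 1024) -> tuple[int, int]:
    """Scale (width, height) so the long side is at most `cap`; never upscale.

    Hypothetical helper: the real `upper_bound_resize` may round differently.
    """
    long_side = max(width, height)
    if long_side <= cap:
        return width, height
    scale = cap / long_side
    return max(1, round(width * scale)), max(1, round(height * scale))


def clamp_segmentation_side(requested: int, depth_long_side: int, floor: int = 128) -> int:
    """Keep the segmentation side at or below the depth long side, but not under `floor`.

    Assumption: this is one reading of "clamped to the depth resolution (min 128)".
    """
    return max(floor, min(requested, depth_long_side))


depth_size = cap_long_side(4000, 3000)                       # large photo, default cap
seg_side = clamp_segmentation_side(512, max(depth_size))     # default segmentation side
```

With the defaults, a 4000x3000 input would be processed for depth at 1024x768 and segmented at 512 px on the long side.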

## Constants and Defaults
- Altitude/FOV defaults: 450 m, 90° (footprint default 10 m).
- Thresholds: `std_thresh` default 0.005, `grad_thresh` default 0.1; both auto-scale with depth resolution so sliders act as base values.
- Clearance factor: default 1.0 (dilates hazards by the footprint size).
- Coverage strictness: default 0.95 (fraction of the footprint that must be safe).
- Texture threshold: default 0.3 (suppresses highly textured regions).
- Depth smoothing is supported but set to 0.0 in the UI (effectively off).
- Roof mask: SAM3 promptable segmentation (default prompt: `roof`), resized to depth scale and expanded to footprint size; no depth-based roof heuristics remain.

## Per-Image Processing Pipeline
1. **Load and crop** the selected image (RGB, 5% border removed).
2. **Depth inference**: Run DA3 with long side clamped to `process_res_cap`; obtain `depth_raw`, then detrend with `remove_global_plane`. Optional Gaussian blur uses `depth_smoothing_base * res_scale` (currently zero).
3. **Footprint sizing**:
   - `fx = (W/2) / tan(FOV/2)` where `W` is depth width.
   - `patch_px = footprint_m * fx / altitude_m`, clamped to bounds and forced odd; `half_span = patch_px//2`.
   - Visualization window `vis_patch` is an odd size capped to 1/8 of the smallest depth dimension for sharper std previews.
4. **Texture mask**: Sobel gradient magnitude of the RGB image (blurred with a kernel scaled by `patch_px/40`), normalized; pixels above `texture_threshold` are suppressed.
5. **Segmentation masks (optional)**:
   - Water/Road/Tree/Roof via SAM3 at `segmentation_max_side`, with text prompts. Instance masks are unioned per class, resized to depth scale, and dilated to footprint size for blocking.
6. **Flat region search (`pick_flat_patch`)**:
   - Normalize depth to [0,1], compute `std_map` from box-filtered mean and mean-of-squares (`std = sqrt(mean_sq - mean^2)`), and compute `grad_norm` as the `np.gradient` magnitude normalized by its 95th percentile.
   - Landing mask starts from `grad_norm < grad_thresh_eff`, excludes water if present, and keeps the lowest-variance patch as a fallback box.
7. **Safe mask construction**:
   - Base safe mask: `(std_map < std_thresh_eff) & (grad_norm < grad_thresh_eff) & landing_mask & texture_mask`.
   - Apply segmentation blocks (expanded masks) to remove water/road/tree/roof regions.
   - Clearance: dilate hazards by `clearance_factor * patch_px` (default 1.0).
   - Coverage: box filter with `patch_px` window; keep pixels meeting `coverage_strictness` (default 0.95).
   - Drop small components (< footprint area).
8. **Center selection**:
   - Prefer centers where full-footprint coverage exists; choose the largest component and rank by distance transform minus flatness penalty (`openness_weight`).
   - If no full coverage but safe pixels exist, pick the safest point inside the safe mask (distance vs. flatness).
   - Fallbacks: landing mask with segmentation removed; if empty, use the flattest patch center.
   - Convert depth center to image space; footprint box is scaled to image pixels and clamped to a minimum of 3 px.
9. **Visualization layers**:
   - Depth colormap from `depth_raw`.
   - Flatness std preview (`std_map_vis`), gradient magnitude, gradient mask, flatness heatmap overlay.
   - Water/Road/Tree/Roof masks and per-class hazard overlays.
   - Safety overlays: green safe heatmap, red hazard overlay from `risk_map`, grayscale safety score, landing spot box/crosshair.
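
The footprint sizing in step 3 can be worked through numerically. The sketch below applies the two formulas from that step to the documented defaults (450 m altitude, 90° FOV, 10 m footprint) at a depth width of 1024 px. The function name, the rounding mode, and the clamp bounds (`min_px`, `max_px`) are assumptions, since the document only says the value is "clamped to bounds".

```python
import math


def footprint_patch_px(depth_width, footprint_m=10.0, altitude_m=450.0,
                       fov_deg=90.0, min_px=3, max_px=None):
    """Return (fx, patch_px, half_span) for a pinhole camera looking straight down.

    min_px / max_px are assumed bounds, not values taken from the app.
    """
    # Focal length in pixels: fx = (W/2) / tan(FOV/2)
    fx = (depth_width / 2) / math.tan(math.radians(fov_deg) / 2)
    # Footprint in pixels: patch_px = footprint_m * fx / altitude_m
    patch_px = int(round(footprint_m * fx / altitude_m))
    upper = depth_width if max_px is None else max_px
    patch_px = max(min_px, min(patch_px, upper))
    # Force an odd window so it has a well-defined center pixel.
    if patch_px % 2 == 0:
        patch_px += 1 if patch_px < upper else -1
    return fx, patch_px, patch_px // 2


fx, patch_px, half_span = footprint_patch_px(1024)
print(fx, patch_px, half_span)
```

At the defaults, `fx` is about 512 px, so a 10 m footprint seen from 450 m spans roughly 11.4 px and the patch becomes 11 px with a half span of 5. Lowering the altitude or narrowing the FOV enlarges the patch proportionally.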

## Safety Heatmap and Hazards
- `safe_mask` drives the green overlay (alpha per pixel).
- Hazard overlay uses `risk_map` (the per-pixel maximum of the std and gradient over-threshold terms). Pixels above `risk_threshold` are emphasized; water/road/tree hazards can also be overlaid separately.
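
As an illustration of how the flatness and gradient criteria could feed both the safe mask and a risk map, here is a minimal NumPy sketch. It is not the app's `pick_flat_patch`: the function names are hypothetical, and the risk map is written as the larger of the two threshold ratios, which is only one possible reading of "max of std/grad over-threshold".

```python
import numpy as np


def box_mean(img, k):
    """Mean over a k x k window (k odd) with edge replication."""
    padded = np.pad(img, k // 2, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return windows.mean(axis=(-2, -1))


def flatness_maps(depth, k):
    """Local std (via mean and mean-of-squares) and percentile-normalized gradient."""
    d = depth.astype(np.float64)
    d = (d - d.min()) / max(d.max() - d.min(), 1e-12)      # normalize to [0, 1]
    mean = box_mean(d, k)
    mean_sq = box_mean(d * d, k)
    std_map = np.sqrt(np.clip(mean_sq - mean * mean, 0.0, None))
    gy, gx = np.gradient(d)
    grad = np.hypot(gx, gy)
    scale = np.percentile(grad, 95)
    grad_norm = grad / scale if scale > 0 else np.zeros_like(grad)
    return std_map, grad_norm


def safe_and_risk(std_map, grad_norm, std_thresh=0.005, grad_thresh=0.1):
    """Base safe mask plus an assumed ratio-style risk map (> 1 means over threshold)."""
    safe = (std_map < std_thresh) & (grad_norm < grad_thresh)
    risk = np.maximum(std_map / std_thresh, grad_norm / grad_thresh)
    return safe, risk


# Synthetic depth: flat on the left, a sharp step at column 10.
depth = np.zeros((20, 20))
depth[:, 10:] = 1.0
std_map, grad_norm = flatness_maps(depth, k=3)
safe, risk = safe_and_risk(std_map, grad_norm)
```

On this synthetic step, pixels far from the edge stay safe while the step columns fail both criteria and get a risk value well above 1.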

## Overlay Composition (`compose_view`)
- Base view: one of the named layers (RGB/Depth/Flatness/Gradient/Gradient mask/Water mask/Road mask/Tree mask/Safety score/Safety heatmap overlay).
- Overlays: safety heatmap, hazard heatmap, per-class hazards (water/road/tree), gradient, optional landing spot box. Fixed alpha values; toggling overlays does not rerun inference.
- Returned image is RGB.
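
A fixed-alpha overlay of this kind can be sketched as a per-pixel linear blend. The code below is an illustration under assumptions, not the app's `compose_view`: the function names, the layer dictionary, and the colors and alpha values are all hypothetical.

```python
import numpy as np


def blend_overlay(base_rgb, mask, color, alpha=0.5):
    """Alpha-blend `color` onto `base_rgb`, weighted per pixel by `mask` in [0, 1]."""
    base = base_rgb.astype(np.float32)
    weight = (mask.astype(np.float32) * alpha)[..., None]   # H x W x 1
    tint = np.asarray(color, dtype=np.float32)
    out = base * (1.0 - weight) + tint * weight
    return np.clip(out, 0, 255).astype(np.uint8)


def compose(layers, base_name, overlays):
    """Start from a cached base layer and stack (mask, color, alpha) overlays on top."""
    out = layers[base_name]
    for mask, color, alpha in overlays:
        out = blend_overlay(out, mask, color, alpha)
    return out


# Tiny example: gray base image, one "safe" pixel tinted green.
layers = {"RGB": np.full((2, 2, 3), 100, dtype=np.uint8)}
safe_mask = np.array([[1.0, 0.0], [0.0, 0.0]])
view = compose(layers, "RGB", [(safe_mask, (0, 200, 0), 0.5)])
```

Because the function only reads already-rendered arrays, toggling an overlay amounts to calling it again with a different overlay list, which is consistent with overlays not rerunning inference.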

## Caching and State
- Depth model cache keyed by model id (`DepthEngine`); default model is preloaded.
- SAM3 models are cached per id; masks are not cached and are recomputed every run to reflect real-time cost.
- `images_state` holds the latest rendered layers; overlay-only changes don’t rerun inference. Prompt changes only re-trigger processing on submit/Run, not every keystroke.
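
The "cache the model, recompute the outputs" pattern can be sketched with a small per-id cache. `ModelCache`, the loader, and the model id below are hypothetical stand-ins; the real `DepthEngine` is not reproduced here.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class ModelCache(Generic[T]):
    """Load each model id at most once; later lookups reuse the loaded object."""

    def __init__(self, loader: Callable[[str], T]) -> None:
        self._loader = loader
        self._models: Dict[str, T] = {}

    def get(self, model_id: str) -> T:
        if model_id not in self._models:
            self._models[model_id] = self._loader(model_id)
        return self._models[model_id]


# Stand-in loader that records how often it is called (a real one would load weights).
load_calls = []


def fake_loader(model_id: str) -> object:
    load_calls.append(model_id)
    return object()


cache = ModelCache(fake_loader)
first = cache.get("example/depth-model")    # hypothetical id: triggers a load
second = cache.get("example/depth-model")   # served from the cache
```

Only the model object is cached in this sketch; any masks or depth maps produced by it would still be computed on each run, matching the behavior described for SAM3.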

## User Controls and Effects
- `process_res_cap`: depth max side (px) for DA3.
- `footprint_m`, `altitude_m`, `fov_deg`: determine footprint size in pixels.
- `std_thresh`, `grad_thresh`, `texture_threshold`: safety criteria.
- `clearance_factor`: hazard dilation (default 1.0).
- `coverage_strictness`, `openness_weight`: coverage tolerance and center ranking.
- Segmentation toggles, prompts, `segmentation_max_side`, `segmentation_score_thresh`, `segmentation_mask_thresh`: control SAM3 masks for water/road/tree/roof.
- Overlay toggles affect only display; inference results are reused.

## Error Handling
- Bad/missing inputs raise Gradio errors.
- Segmentation failures warn and proceed without that mask; water/road/tree/roof warnings clarify when masks are disabled or not detected.
- Coverage/boxFilter fallbacks keep processing even if OpenCV operations fail.
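
One way such a fallback could look is shown below: the coverage fraction is computed with OpenCV's `cv2.boxFilter` when available, and with a pure NumPy window mean otherwise. This is a sketch under assumptions (odd `patch_px`, zero padding at the borders, a hypothetical function name), not the app's actual fallback logic.

```python
import numpy as np


def coverage_fraction(safe_mask, patch_px):
    """Fraction of safe pixels inside each patch_px x patch_px window (patch_px odd)."""
    mask = safe_mask.astype(np.float32)
    try:
        import cv2  # optional dependency in this sketch
        return cv2.boxFilter(mask, -1, (patch_px, patch_px),
                             normalize=True, borderType=cv2.BORDER_CONSTANT)
    except Exception:
        # Fallback: zero-pad and average sliding windows with NumPy only.
        padded = np.pad(mask, patch_px // 2, mode="constant")
        windows = np.lib.stride_tricks.sliding_window_view(padded, (patch_px, patch_px))
        return windows.mean(axis=(-2, -1))


safe_mask = np.ones((7, 7), dtype=bool)
full = coverage_fraction(safe_mask, 3)

safe_mask[3, 3] = False                      # punch one unsafe pixel into the center
holed = coverage_fraction(safe_mask, 3)
keep = holed >= 0.95                         # default coverage strictness
```

With a 3x3 window, a single unsafe pixel drops the center coverage to 8/9, which is below the default strictness of 0.95, so that center would be rejected on either code path.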

## Outputs
- Dict of PIL Images keyed by: RGB, Depth, Flatness map (std), Depth gradient, Gradient mask, Water mask, Road mask, Tree mask, Roof mask, Safety heatmap overlay, Hazard overlay, Water/Road/Tree hazard overlays, Flatness heatmap overlay, Safety score (grayscale), Landing spot overlay.
- Run summaries surface model id, process resolution, runtime, footprint size (depth + image scale), landing center, safe/hazard coverage, effective thresholds, per-mask coverage (water/road/tree/roof), and warnings for disabled/absent masks or missing safe regions; the UI cards render these fields directly.
- `compose_view` uses these layers to build the preview.