Landing Site Safety Analyzer – Architecture and Calculations
This document describes the flow in the current Gradio app (app/ui.py), from input selection through model inference, safety scoring, and UI composition.
Data and Models
- Inputs: Images under `data/Image/` (VISLOC and any custom folders) via `list_all_data_inputs`, with a 5% border crop (`crop_nonblack`) to drop black padding. Supported extensions: jpg/jpeg/png (any case).
- Depth model: Depth Anything 3, cached per model id (`DepthEngine`). Inference caps the long side to `process_res_cap` (default 1024) using `upper_bound_resize` before predicting.
- Segmentation model: SAM3 (`facebook/sam3`) for promptable water/road/tree/roof masking. The segmenter is cached per model id but masks are recomputed every run (no output cache). Default `segmentation_max_side` is 512 and is clamped to the depth resolution (min 128).
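The two resolution limits above can be sketched as follows. This is a minimal illustration, not the app's code: `cap_long_side` and `segmentation_side` are hypothetical helpers, and the real `upper_bound_resize` may round or interpolate differently.

```python
def cap_long_side(width: int, height: int, cap: int) -> tuple[int, int]:
    """Shrink (width, height) so the longer side is at most `cap`; never upscale."""
    long_side = max(width, height)
    if long_side <= cap:
        return width, height
    scale = cap / long_side
    # Keep the aspect ratio and avoid a zero-sized dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))


def segmentation_side(requested: int, depth_long_side: int, floor: int = 128) -> int:
    """Clamp the segmentation side to the depth resolution, with a lower floor.

    Assumption: the floor wins when the depth resolution is below it.
    """
    return max(floor, min(requested, depth_long_side))
```

For example, a 4000x3000 image with a cap of 1024 is processed at 1024x768, while an 800x600 image is left unchanged.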
Constants and Defaults
- Altitude/FOV defaults: 450 m, 90° (footprint default 10 m).
- Thresholds: `std_thresh` default 0.005, `grad_thresh` default 0.1; both auto-scale with depth resolution so sliders act as base values.
- Clearance factor: default 1.0 (dilates hazards by the footprint size).
- Coverage strictness: default 0.95 (fraction of the footprint that must be safe).
- Texture threshold: default 0.3 (suppresses highly textured regions).
- Depth smoothing is supported but set to 0.0 in the UI (effectively off).
- Roof mask: SAM3 promptable segmentation (default prompt: `roof`), resized to depth scale and expanded to footprint size; no depth-based roof heuristics remain.
Per-Image Processing Pipeline
- Load and crop the selected image (RGB, 5% border removed).
- Depth inference: Run DA3 with long side clamped to `process_res_cap`; obtain `depth_raw`, then detrend with `remove_global_plane`. Optional Gaussian blur uses `depth_smoothing_base * res_scale` (currently zero).
- Footprint sizing:
  - `fx = (W/2) / tan(FOV/2)` where `W` is depth width.
  - `patch_px = footprint_m * fx / altitude_m`, clamped to bounds and forced odd; `half_span = patch_px//2`.
  - Visualization window `vis_patch` is an odd size capped to 1/8 of the smallest depth dimension for sharper std previews.
- Texture mask: Sobel magnitude on the RGB (blurred by `patch_px/40`), normalized; pixels above `texture_threshold` are suppressed.
- Segmentation masks (optional):
  - Water/Road/Tree/Roof via SAM3 at `segmentation_max_side`, with text prompts. Instance masks are unioned per class, resized to depth scale, and dilated to footprint size for blocking.
- Flat region search (`pick_flat_patch`):
  - Normalize depth to [0,1], compute `std_map` via box mean/mean_sq, and `grad_norm` via `np.gradient` normalized at the 95th percentile.
  - Landing mask starts from `grad_norm < grad_thresh_eff`, excludes water if present, and keeps the lowest-variance patch as a fallback box.
- Safe mask construction:
  - Base safe mask: `(std_map < std_thresh_eff) & (grad_norm < grad_thresh_eff) & landing_mask & texture_mask`.
  - Apply segmentation blocks (expanded masks) to remove water/road/tree/roof regions.
  - Clearance: dilate hazards by `clearance_factor * patch_px` (default 1.0).
  - Coverage: box filter with `patch_px` window; keep pixels meeting `coverage_strictness` (default 0.95).
  - Drop small components (< footprint area).
- Center selection:
  - Prefer centers where full-footprint coverage exists; choose the largest component and rank by distance transform minus flatness penalty (`openness_weight`).
  - If no full coverage but safe pixels exist, pick the safest point inside the safe mask (distance vs. flatness).
  - Fallbacks: landing mask with segmentation removed; if empty, use the flattest patch center.
  - Convert depth center to image space; footprint box is scaled to image pixels and clamped to a minimum of 3 px.
- Visualization layers:
  - Depth colormap from `depth_raw`.
  - Flatness std preview (`std_map_vis`), gradient magnitude, gradient mask, flatness heatmap overlay.
  - Water/Road/Tree/Roof masks and per-class hazard overlays.
  - Safety overlays: green safe heatmap, red hazard overlay from `risk_map`, grayscale safety score, landing spot box/crosshair.
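The flatness and gradient steps of the pipeline can be sketched with NumPy alone. This is an illustrative reconstruction under assumptions, not the code in `pick_flat_patch`: the helper names, the edge padding, and the clipping of `grad_norm` to [0, 1] are all choices made for this example, and the app may use OpenCV box filters instead of an integral image.

```python
import numpy as np


def box_mean(a: np.ndarray, k: int) -> np.ndarray:
    """Mean over a k x k window (k odd) with edge padding, via an integral image."""
    pad = k // 2
    p = np.pad(a.astype(np.float64), pad, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    ii[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)
    h, w = a.shape
    total = ii[k:k + h, k:k + w] - ii[:h, k:k + w] - ii[k:k + h, :w] + ii[:h, :w]
    return total / (k * k)


def flatness_maps(depth: np.ndarray, patch_px: int):
    """Return (std_map, grad_norm) for a depth map, following the steps above."""
    d = depth.astype(np.float64)
    span = d.max() - d.min()
    d = (d - d.min()) / span if span > 0 else np.zeros_like(d)  # normalize to [0, 1]
    mean = box_mean(d, patch_px)
    mean_sq = box_mean(d * d, patch_px)
    # Var = E[x^2] - E[x]^2; clip tiny negatives caused by rounding.
    std_map = np.sqrt(np.clip(mean_sq - mean * mean, 0.0, None))
    gy, gx = np.gradient(d)
    grad = np.hypot(gx, gy)
    p95 = np.percentile(grad, 95)
    # Assumption: values above the 95th percentile saturate at 1.
    grad_norm = np.clip(grad / p95, 0.0, 1.0) if p95 > 0 else np.zeros_like(grad)
    return std_map, grad_norm


def base_safe_mask(std_map, grad_norm, std_thresh_eff, grad_thresh_eff,
                   landing_mask=None, texture_mask=None):
    """Combine the flatness and gradient tests with optional boolean masks."""
    safe = (std_map < std_thresh_eff) & (grad_norm < grad_thresh_eff)
    if landing_mask is not None:
        safe = safe & np.asarray(landing_mask, dtype=bool)
    if texture_mask is not None:
        safe = safe & np.asarray(texture_mask, dtype=bool)
    return safe
```

On a synthetic depth map with a flat area next to a sharp step, pixels well inside the flat area pass both tests, while pixels whose window straddles the step fail on both the standard deviation and the gradient.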
Safety Heatmap and Hazards
- `safe_mask` drives the green overlay (alpha per pixel).
- Hazard overlay uses `risk_map` (max of std/grad over-threshold). Pixels above `risk_threshold` are emphasized; water/road/tree hazards can also be overlaid separately.
Overlay Composition (compose_view)
- Base view: one of the named layers (RGB/Depth/Flatness/Gradient/Gradient mask/Water mask/Road mask/Tree mask/Safety score/Safety heatmap overlay).
- Overlays: safety heatmap, hazard heatmap, per-class hazards (water/road/tree), gradient, optional landing spot box. Fixed alpha values; toggling overlays does not rerun inference.
- Returned image is RGB.
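A fixed-alpha overlay of this kind can be sketched as a per-pixel blend. The function below is a hypothetical helper, not `compose_view` itself: it assumes layers are available as `uint8` NumPy arrays (the app stores PIL Images) and that each overlay is a solid color weighted by a mask in [0, 1].

```python
import numpy as np


def blend_overlay(base_rgb: np.ndarray, color, alpha_map: np.ndarray,
                  strength: float = 0.5) -> np.ndarray:
    """Blend a solid `color` over `base_rgb` using per-pixel alpha times a fixed strength."""
    base = base_rgb.astype(np.float32)
    # Per-pixel alpha in [0, 1], broadcast over the three channels.
    alpha = np.clip(alpha_map.astype(np.float32), 0.0, 1.0)[..., None] * strength
    out = base * (1.0 - alpha) + np.asarray(color, dtype=np.float32) * alpha
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

Because the blend reads only already-rendered arrays, toggling an overlay needs no model call, which is consistent with overlays not rerunning inference.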
Caching and State
- Depth model cache keyed by model id (`DepthEngine`); default model is preloaded.
- SAM3 models are cached per id; masks are not cached and are recomputed every run to reflect real-time cost.
- `images_state` holds the latest rendered layers; overlay-only changes don’t rerun inference. Prompt changes only re-trigger processing on submit/Run, not every keystroke.
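The per-id model caching described above follows a common pattern, sketched here. `ModelCache` and its `loader` argument are hypothetical names for illustration; the actual `DepthEngine` and SAM3 caches may be structured differently.

```python
class ModelCache:
    """Load each model id once and reuse the instance on later requests."""

    def __init__(self, loader):
        self._loader = loader   # callable: model_id -> loaded model
        self._models = {}

    def get(self, model_id: str):
        # Only the model object is cached; outputs (e.g. masks) are not,
        # so every run still pays the full inference cost.
        if model_id not in self._models:
            self._models[model_id] = self._loader(model_id)
        return self._models[model_id]
```

Preloading the default model then amounts to calling `get` once at startup with the default id.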
User Controls and Effects
- `process_res_cap`: depth max side (px) for DA3.
- `footprint_m`, `altitude_m`, `fov_deg`: determine footprint size in pixels.
- `std_thresh`, `grad_thresh`, `texture_threshold`: safety criteria.
- `clearance_factor`: hazard dilation (default 1.0).
- `coverage_strictness`, `openness_weight`: coverage tolerance and center ranking.
- Segmentation toggles, prompts, `segmentation_max_side`, `segmentation_score_thresh`, `segmentation_mask_thresh`: control SAM3 masks for water/road/tree.
- Overlay toggles affect only display; inference results are reused.
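How `footprint_m`, `altitude_m`, and `fov_deg` translate into a pixel footprint can be sketched from the pinhole relation given in the pipeline section. The clamp bounds are not stated in this document, so `lo` and `hi` below are assumptions, as is forcing odd by rounding down.

```python
import math


def footprint_pixels(depth_width: int, fov_deg: float, footprint_m: float,
                     altitude_m: float, lo: int = 3, hi=None):
    """Return (patch_px, half_span) for a square footprint on the depth map.

    `lo` must be odd; `lo` and `hi` are hypothetical bounds for illustration.
    """
    # Focal length in pixels for a pinhole camera with horizontal FOV `fov_deg`.
    fx = (depth_width / 2) / math.tan(math.radians(fov_deg) / 2)
    patch = int(round(footprint_m * fx / altitude_m))
    if hi is not None:
        patch = min(patch, hi)
    patch = max(patch, lo)
    if patch % 2 == 0:
        patch -= 1  # force odd so the window has a center pixel
    return patch, patch // 2
```

With the documented defaults (450 m altitude, 90° FOV, 10 m footprint) and a 1024 px wide depth map, `fx` is about 512 px and the footprint comes out at 11 px with a `half_span` of 5.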
Error Handling
- Bad/missing inputs raise Gradio errors.
- Segmentation failures warn and proceed without that mask; water/road/tree/roof warnings clarify when masks are disabled or not detected.
- Coverage/boxFilter fallbacks keep processing even if OpenCV operations fail.
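The warn-and-continue behavior can be sketched as a small wrapper. `run_with_fallback` is a hypothetical helper, not a function from the app; it only illustrates catching a failed operation, recording a warning for the run summary, and substituting a fallback result.

```python
def run_with_fallback(primary, fallback, label: str, warnings: list):
    """Run `primary()`; on any exception, record a warning and return `fallback()`."""
    try:
        return primary()
    except Exception as exc:  # broad on purpose: processing should keep going
        warnings.append(f"{label} failed ({exc}); using fallback")
        return fallback()
```

A segmentation step would pass a fallback that returns no mask, while a coverage step would pass a slower but dependency-free implementation.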
Outputs
- Dict of PIL Images keyed by: RGB, Depth, Flatness map (std), Depth gradient, Gradient mask, Water mask, Road mask, Tree mask, Roof mask, Safety heatmap overlay, Hazard overlay, Water/Road/Tree hazard overlays, Flatness heatmap overlay, Safety score (grayscale), Landing spot overlay.
- Run summaries surface model id, process resolution, runtime, footprint size (depth + image scale), landing center, safe/hazard coverage, effective thresholds, per-mask coverage (water/road/tree/roof), and warnings for disabled/absent masks or missing safe regions; the UI cards render these fields directly.
- `compose_view` uses these to build the preview.