Bug Investigation: Slice Comparison Prediction Overlay Not Visible

Issue: Prediction overlay is invisible in slice comparison while ground truth overlay is visible

Date: 2025-12-09
Branch: debug/slice-comparison-prediction-overlay

Observed Behavior

In the Gradio UI "Slice Comparison" tab:

DWI Input (left panel): Shows grayscale brain scan ✓
Prediction (middle panel): Shows grayscale brain scan without any visible overlay ✗
Ground Truth (right panel): Shows grayscale brain scan with green overlay ✓

Expected Behavior

The Prediction panel should show a red overlay on the predicted lesion area, similar to how Ground Truth shows a green overlay.

Code Analysis

Visualization Code (`viewer.py:261-268`)

# Prediction panel
axes[1].imshow(d_slice, cmap="gray")
axes[1].imshow(
    np.ma.masked_where(p_slice == 0, p_slice),
    cmap="Reds",
    alpha=0.5,
    vmin=0,
    vmax=1,
)

Ground Truth Code (`viewer.py:273-280`)

# Ground Truth panel
axes[2].imshow(d_slice, cmap="gray")
axes[2].imshow(
    np.ma.masked_where(g_slice == 0, g_slice),
    cmap="Greens",
    alpha=0.5,
    vmin=0,
    vmax=1,
)

The two blocks are structurally identical. Apart from the mask being drawn (p_slice vs. g_slice), the only difference is the colormap:

Prediction: cmap="Reds"
Ground Truth: cmap="Greens"

Hypothesis

Primary Hypothesis: Probability vs Binary Mask Values

| Mask Type | Typical Values | Colormap Rendering | Visibility |
|---|---|---|---|
| Ground Truth | Binary (0 or 1) | 1.0 → Dark Green | High ✓ |
| Prediction | Probabilities (0.0-0.3) | 0.1 → Nearly White | None ✗ |

Why this matters:

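Matplotlib's "Reds" colormap runs from near-white (0) to dark red (1)
With vmin=0, vmax=1:
- A value of 0.05 maps to the 5% point of the colormap, which is nearly white
- A value of 1.0 maps to the 100% point of the colormap, which is dark red
With alpha=0.5 over a grayscale background, a nearly white overlay only lightens the image slightly and does not read as a colored region, so it is effectively invisible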
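
To make the mapping concrete, the following sketch samples the "Reds" colormap at a low and a high value and composites each over a gray pixel. The helper name `blend_over_gray` and the background level of 0.8 are illustrative assumptions, not part of `viewer.py`, and the simple "over" blend is only an approximation of what `imshow` does with `alpha=0.5`.

```python
import numpy as np
import matplotlib

reds = matplotlib.colormaps["Reds"]  # colormap registry, available in matplotlib >= 3.5


def blend_over_gray(value: float, gray: float = 0.8, alpha: float = 0.5) -> np.ndarray:
    """Composite the colormap colour for `value` over a gray pixel (hypothetical helper)."""
    overlay_rgb = np.array(reds(value)[:3])  # vmin=0, vmax=1, so no rescaling is needed
    background_rgb = np.full(3, gray)
    return alpha * overlay_rgb + (1.0 - alpha) * background_rgb


low = blend_over_gray(0.05)  # low-probability voxel
high = blend_over_gray(1.0)  # binary mask voxel

# Channel spread (max - min) is a rough proxy for how "coloured" the pixel looks.
print("0.05 ->", low.round(3), "spread", round(float(np.ptp(low)), 3))
print("1.00 ->", high.round(3), "spread", round(float(np.ptp(high)), 3))
```

The low-value pixel should come out close to neutral gray, while the high-value pixel should be clearly shifted toward red.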

Evidence:

The DeepISLES SEALS model may output probability maps rather than binary masks
The compute_dice function in metrics.py applies threshold=0.5 to binarize predictions
The visualization does not apply any thresholding before display, so metric and display treat the same prediction differently
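
The mismatch between metric and display can be illustrated with a small Dice helper. `dice_at_threshold` is a stand-in written for this note; the real `compute_dice` in `metrics.py` may differ in signature and edge-case handling.

```python
import numpy as np


def dice_at_threshold(pred: np.ndarray, truth: np.ndarray, threshold: float = 0.5) -> float:
    """Dice coefficient after binarizing `pred` at `threshold` (hypothetical helper)."""
    pred_bin = pred > threshold
    truth_bin = truth > 0.5
    denom = pred_bin.sum() + truth_bin.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (an assumption)
    return float(2.0 * np.logical_and(pred_bin, truth_bin).sum() / denom)


truth = np.zeros((4, 4))
truth[1:3, 1:3] = 1.0

pred = np.zeros((4, 4))
pred[1:3, 1:3] = 0.9  # confident prediction inside the lesion
pred[0, 0] = 0.1      # low-probability voxel outside the lesion

dice = dice_at_threshold(pred, truth)
drawn = int(np.count_nonzero(pred))  # voxels that `pred == 0` masking would still draw
print(f"Dice: {dice}, voxels drawn without thresholding: {drawn}")
```

The metric ignores the 0.1 voxel, while masking on `pred == 0` still sends it to the colormap.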

Alternative Hypotheses

Empty slice: Prediction mask is all zeros at the selected slice (unlikely given the slice selection logic uses get_slice_at_max_lesion(prediction_path))
Data type issue: Float comparison p_slice == 0 may fail for float32 arrays (unlikely: exact zeros compare equal in floating point, and the same test works for ground truth)
File path mismatch: Wrong file being loaded as prediction (need to verify)
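
The first two alternatives can be ruled out quickly by counting how many pixels survive the mask. The sketch below uses a hypothetical helper, `overlay_summary`, on synthetic float32 slices; it is not part of the codebase.

```python
import numpy as np


def overlay_summary(mask_slice: np.ndarray) -> dict:
    """Report what `np.ma.masked_where(mask_slice == 0, ...)` would leave visible."""
    masked = np.ma.masked_where(mask_slice == 0, mask_slice)
    drawn = int(masked.count())  # number of unmasked (drawn) pixels
    return {
        "dtype": str(mask_slice.dtype),
        "drawn_pixels": drawn,
        "max_value": float(mask_slice.max()) if drawn else 0.0,
    }


empty = np.zeros((8, 8), dtype=np.float32)
faint = empty.copy()
faint[2:4, 2:4] = 0.05  # small low-probability region

print(overlay_summary(empty))
print(overlay_summary(faint))
```

A non-zero `drawn_pixels` together with a small `max_value` points back to the primary hypothesis rather than to an empty slice or a failed comparison.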

Diagnostic Steps

1. Check Prediction Mask Values

import nibabel as nib
import numpy as np

# Load a prediction mask from a recent run
pred = nib.load("/path/to/prediction.nii.gz").get_fdata()
print(f"Shape: {pred.shape}")
print(f"Dtype: {pred.dtype}")
print(f"Min: {pred.min()}, Max: {pred.max()}")
print(f"Unique values: {np.unique(pred)[:20]}")  # First 20 unique values
print(f"Non-zero count: {np.count_nonzero(pred)}")
print(f"Values > 0.5: {np.count_nonzero(pred > 0.5)}")

2. Check Ground Truth Mask Values

# Load the matching ground truth mask (reuses the imports from step 1)
gt = nib.load("/path/to/ground_truth.nii.gz").get_fdata()
print(f"Shape: {gt.shape}")
print(f"Dtype: {gt.dtype}")
print(f"Min: {gt.min()}, Max: {gt.max()}")
print(f"Unique values: {np.unique(gt)}")

3. Visual Comparison

import matplotlib.pyplot as plt

# Plot histograms of the non-zero values in each mask
# (reuses `pred` and `gt` loaded in steps 1 and 2)
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].hist(pred[pred > 0], bins=50)  # boolean indexing already returns a 1-D array
axes[0].set_title("Prediction non-zero values")
axes[1].hist(gt[gt > 0], bins=50)
axes[1].set_title("Ground Truth non-zero values")
fig.tight_layout()
fig.savefig("mask_histograms.png")
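
The checks above can be condensed into a single classification step. The following is a minimal sketch; `classify_mask` and its tolerance are assumptions made for illustration and do not exist in the project.

```python
import numpy as np


def classify_mask(values: np.ndarray, tol: float = 1e-6) -> str:
    """Classify a mask as 'empty', 'binary', 'probability', or 'other' (hypothetical helper)."""
    nonzero = values[values != 0]
    if nonzero.size == 0:
        return "empty"
    if np.all(np.abs(nonzero - 1.0) <= tol):
        return "binary"       # every non-zero voxel is 1
    if nonzero.min() >= 0.0 and nonzero.max() <= 1.0:
        return "probability"  # fractional values inside [0, 1]
    return "other"            # e.g. a multi-label map


print(classify_mask(np.array([0.0, 1.0, 1.0])))    # binary
print(classify_mask(np.array([0.0, 0.05, 0.3])))   # probability
```

If the prediction is classified as "probability" and the ground truth as "binary", the primary hypothesis is supported.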

Proposed Fix

Option A: Binarize Prediction Before Display (Recommended)

# In render_slice_comparison, before creating overlay:
p_slice_binary = (p_slice > 0.5).astype(float)

axes[1].imshow(
    np.ma.masked_where(p_slice_binary == 0, p_slice_binary),
    cmap="Reds",
    alpha=0.5,
    vmin=0,
    vmax=1,
)

Pros:

Consistent with how compute_dice treats predictions
Clear visualization of model decision boundary
Matches clinical interpretation (lesion vs not-lesion)

Cons:

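Loses probability information in visualization
Voxels with probability at or below 0.5 are not drawn at all, so a prediction that never exceeds 0.5 produces an empty overlay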

Option B: Dynamic Normalization

# Normalize to actual value range instead of fixed 0-1
p_max = p_slice.max() if p_slice.max() > 0 else 1.0
axes[1].imshow(
    np.ma.masked_where(p_slice == 0, p_slice),
    cmap="Reds",
    alpha=0.5,
    vmin=0,
    vmax=p_max,
)

Pros:

Shows probability information
Works regardless of value range

Cons:

Inconsistent intensity across cases
Low-confidence predictions still appear bright (misleading)

Option C: Threshold-Based Masking

# Only show values above a threshold
threshold = 0.5
axes[1].imshow(
    np.ma.masked_where(p_slice < threshold, p_slice),
    cmap="Reds",
    alpha=0.5,
    vmin=threshold,
    vmax=1.0,
)

Pros:

Only shows confident predictions
Good dynamic range for visible values

Cons:

May hide uncertain but potentially relevant areas

Recommendation

Implement Option A (Binarize) because:

It matches the clinical use case (segmentation → binary decision)
It's consistent with compute_dice threshold behavior
It provides clear, interpretable visualization
The raw probability map can still be viewed in NiiVue if needed
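
To compare what each approach would actually hand to `imshow`, the sketch below builds a synthetic probability slice (a low-probability halo around a confident core) and counts the unmasked pixels for the original code, Option A, and Option C. The array shape and values are made up for illustration.

```python
import numpy as np

p_slice = np.zeros((10, 10), dtype=np.float32)
p_slice[2:8, 2:8] = 0.2  # low-probability halo
p_slice[4:6, 4:6] = 0.9  # confident core

# Original code: every non-zero probability is drawn, mostly in near-white shades.
original = np.ma.masked_where(p_slice == 0, p_slice)

# Option A: binarize first, so every drawn pixel sits at the dark end of "Reds".
p_slice_binary = (p_slice > 0.5).astype(float)
option_a = np.ma.masked_where(p_slice_binary == 0, p_slice_binary)

# Option C: mask below the threshold but keep the original probabilities.
option_c = np.ma.masked_where(p_slice < 0.5, p_slice)

for name, overlay in [("original", original), ("option A", option_a), ("option C", option_c)]:
    print(f"{name}: {overlay.count()} pixels drawn, mean value {float(overlay.mean()):.2f}")
```

Options A and C draw the same pixels here; they differ only in whether the drawn values are fixed at 1 or keep their probability.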

Dependencies

| Package | Version | Relevance |
|---|---|---|
| gradio | >=6.0.0 | Unlikely cause (renders matplotlib figure correctly) |
| matplotlib | >=3.8.0 | Colormap behavior is standard |
| numpy | >=1.26.0,<2.0.0 | Float comparison works correctly |
| nibabel | >=5.2.0 | Loads data correctly |

Resolution

Status: FIXED (2025-12-09)
Branch: debug/slice-comparison-prediction-overlay

Changes Made

Primary Fix (Issue #23):

viewer.py:270-275: Added binarization of prediction mask in render_slice_comparison:

# Binarize prediction at threshold 0.5 for visible overlay (Issue #23)
p_slice_binary = (p_slice > 0.5).astype(float)

viewer.py:156-164: Added binarization in render_3panel_view for consistency
tests/conftest.py: Added synthetic_probability_mask and synthetic_binary_mask fixtures
tests/ui/test_viewer.py: Added TestRenderSliceComparisonProbabilityMask test class

Additional Fixes (Found During Audit):

Race Condition (P2): Replaced global _previous_results_dir with gr.State for per-session thread-safe cleanup tracking
Inconsistent Threshold in compute_volume_ml: Added threshold=0.5 parameter for consistent binarization
render_3panel_view Wired Into UI:
- Added gr.Tabs layout with "Interactive 3D" and "Static Report" tabs
- render_3panel_view now displayed in "Static Report" alongside slice comparison
- Provides a fallback via static matplotlib figures when WebGL2 is unavailable
Thread-Safe Matplotlib: Refactored from pyplot API to Object-Oriented API (Figure()) for multi-user safety
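
The object-oriented pattern mentioned in the last item can be sketched as follows. This is not the project's `render_slice_comparison`; the function name `render_overlay_png` and its signature are assumptions. It shows a figure built with `Figure()` directly, so nothing is registered in pyplot's global state that concurrent requests could share.

```python
import io

import numpy as np
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_overlay_png(background: np.ndarray, prediction: np.ndarray,
                       threshold: float = 0.5) -> bytes:
    """Render a binarized prediction overlay to PNG bytes without using pyplot."""
    fig = Figure(figsize=(4, 4))  # standalone figure, not tracked by pyplot
    FigureCanvasAgg(fig)          # attach a non-interactive Agg canvas
    ax = fig.subplots()
    ax.imshow(background, cmap="gray")
    binary = (prediction > threshold).astype(float)
    ax.imshow(np.ma.masked_where(binary == 0, binary),
              cmap="Reds", alpha=0.5, vmin=0, vmax=1)
    ax.set_axis_off()
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    return buf.getvalue()


rng = np.random.default_rng(0)
background = rng.random((32, 32))
prediction = np.zeros((32, 32))
prediction[10:20, 10:20] = 0.8
png = render_overlay_png(background, prediction)
print(f"Rendered {len(png)} bytes")
```

Because each call owns its figure, there is no `plt.close()` bookkeeping and no shared "current figure" between sessions.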

Verification

All 136 tests pass
Lint (ruff) passes
Type check (mypy) passes

Files Modified

| File | Changes |
|---|---|
| `src/stroke_deepisles_demo/ui/viewer.py` | OO matplotlib API, binarization in both render functions |
| `src/stroke_deepisles_demo/ui/app.py` | gr.State, render_3panel_view integration, volume_ml |
| `src/stroke_deepisles_demo/ui/components.py` | Tabs layout (Interactive 3D / Static Report) |
| `src/stroke_deepisles_demo/metrics.py` | threshold parameter for compute_volume_ml |
| `tests/conftest.py` | New probability/binary mask fixtures |
| `tests/ui/test_viewer.py` | Probability mask tests |
| `tests/ui/test_app.py` | Updated for new return signature |

Next Steps

Run diagnostic script to confirm hypothesis
Implement fix (Option A - binarize)
Add test case for probability-valued masks
Wire render_3panel_view into UI with tabs
Fix race condition with gr.State
Make matplotlib thread-safe with OO API
Verify fix in local Gradio app (manual testing recommended)
Create PR and merge to main
