Date: February 20, 2026
This report documents the full methodology used to adapt a pretrained IVUS segmentation model into a multi-task model that performs lumen segmentation and frame-level bifurcation classification.
The goal is to provide a self-contained technical description of model design, training behavior, threshold calibration, results, and limitations.
Given an IVUS frame x, we optimize two tasks:
M_hat: lumen mask over pixels
y_hat: bifurcation probability in [0,1]
The model is trained at frame level. There is no temporal model (no recurrence, no sequence transformer, no optical flow objective).
The dataset is built from a frame-bank of manually labeled IVUS frames with train/validation/test partitions.
Split counts:
Bifurcation positive rate by split:
Lumen annotation coverage by split:
This means classification supervision is denser than segmentation supervision in the multi-task setting.

A pretrained segmentation backbone is reused as initialization.
A lightweight multi-task classification head is attached on top of segmentation logits:
This is a multi-task head, not an attention module.
The segmentation branch and classification branch share upstream representation. This encourages feature reuse while keeping task-specific outputs separate.

For each frame:
For segmentation labels, only frames with valid lumen polygons are supervised.
Let i index samples in a minibatch.
m_i in {0,1}^{H x W}: ground-truth lumen mask
m_hat_i: predicted lumen probability map
y_i in {0,1}: bifurcation label
y_hat_i in (0,1): bifurcation probability
h_i in {0,1}: has-mask indicator (1 if segmentation label exists)
Weighted BCE + Dice:
L_seg,i = L_wbce(m_i, m_hat_i; w_pos) + lambda_dice * L_dice(m_i, m_hat_i)
Masked batch aggregation (only labeled masks contribute):
L_seg = (sum_i h_i * L_seg,i) / (sum_i h_i + eps)
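The masked aggregation can be illustrated with a minimal pure-Python sketch. It is not the training code: the exact forms of the weighted BCE and the soft Dice term are assumptions (the report names them but does not spell them out), masks are flattened to 1-D lists for brevity, and the function names and default values of w_pos, lambda_dice, and eps are hypothetical.

```python
import math

def weighted_bce(mask, prob, w_pos, eps=1e-7):
    """Mean pixel-wise BCE; positive (lumen) pixels are scaled by w_pos (assumed form)."""
    total = 0.0
    for m, p in zip(mask, prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(w_pos * m * math.log(p) + (1 - m) * math.log(1.0 - p))
    return total / len(mask)

def dice_loss(mask, prob, eps=1e-7):
    """Soft Dice loss: 1 - 2*sum(m*p) / (sum(m) + sum(p)) (assumed form)."""
    inter = sum(m * p for m, p in zip(mask, prob))
    denom = sum(mask) + sum(prob)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def masked_seg_loss(masks, probs, has_mask, w_pos=1.0, lambda_dice=1.0, eps=1e-7):
    """L_seg = (sum_i h_i * L_seg,i) / (sum_i h_i + eps)."""
    num = 0.0
    for m, p, h in zip(masks, probs, has_mask):
        if h:  # frames without a lumen label contribute nothing
            num += weighted_bce(m, p, w_pos) + lambda_dice * dice_loss(m, p)
    return num / (sum(has_mask) + eps)
```

Because the denominator counts only labeled frames, adding unlabeled frames to a minibatch leaves L_seg unchanged, and a minibatch with no labeled frames yields 0 rather than a division error.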
Binary cross entropy:
L_cls = (1/B) * sum_i L_bce(y_i, y_hat_i)
L_total = w_seg * L_seg + w_cls * L_cls
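As a small worked illustration of the classification loss and the weighted sum, the sketch below computes L_cls over a minibatch and combines it with a segmentation loss value supplied as a plain number. The function names and the default weights are hypothetical; the report does not state the values of w_seg and w_cls used in training.

```python
import math

def bce(y, y_hat, eps=1e-7):
    """Binary cross entropy for one frame; y in {0,1}, y_hat in (0,1)."""
    y_hat = min(max(y_hat, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))

def cls_loss(labels, probs):
    """L_cls = (1/B) * sum_i L_bce(y_i, y_hat_i); every frame contributes."""
    return sum(bce(y, p) for y, p in zip(labels, probs)) / len(labels)

def total_loss(l_seg, l_cls, w_seg=1.0, w_cls=1.0):
    """L_total = w_seg * L_seg + w_cls * L_cls (default weights are placeholders)."""
    return w_seg * l_seg + w_cls * l_cls
```

Note that L_cls averages over all B frames, whereas L_seg averages only over frames with a mask, so the two terms are normalized over different sample counts within the same minibatch.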
After model training, the bifurcation threshold t is selected on validation data by grid search over candidate thresholds.
For each t:
y_hat_i^(t) = 1[y_hat_i >= t]
Compute precision, recall, F1, accuracy, etc., then choose:
t* = argmax_t F1_val(t)
The selected threshold is persisted and reused during runtime inference.
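The selection procedure above can be sketched as follows. This is an illustrative implementation under assumptions: the candidate grid, the tie-breaking rule (first candidate wins), and the function names are hypothetical and not taken from the project code.

```python
def binary_metrics(labels, probs, t):
    """Precision, recall, and F1 for predictions 1[y_hat >= t]."""
    preds = [1 if p >= t else 0 for p in probs]
    tp = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(labels, preds) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

def select_threshold(labels, probs, candidates):
    """Return (t*, F1_val(t*)) with t* = argmax_t F1_val(t) over the candidates."""
    best_t, best_f1 = None, -1.0
    for t in candidates:
        f1 = binary_metrics(labels, probs, t)["f1"]
        if f1 > best_f1:  # strict '>' keeps the first candidate on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Example grid (an assumption): 0.01, 0.02, ..., 0.99
candidates = [i / 100 for i in range(1, 100)]
```

The threshold is chosen on validation data only, so the test split stays untouched until final evaluation with the fixed t*.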

Observed behavior:

Segmentation (subset with lumen labels):
Bifurcation classification:
Confusion matrix:

Metric snapshot:


Note: the compared evaluations do not use identical sample sets, so the comparison is directional only and should not be read as a controlled head-to-head result.
Standalone classifier diagnostics (supporting analysis):

These plots illustrate threshold sensitivity, score separation, and calibration quality.
Train/validation/test share source pullback files (frame-level partitioning rather than source-level partitioning).
Because the model is frame-independent, this is not temporal leakage. However, repeated source style/statistics across splits can make in-domain metrics optimistic.
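To make the distinction concrete, the sketch below first lists sources that appear in more than one split, then shows one possible way to assign whole sources to a single split with a stable hash. It is a hedged illustration, not the pipeline used here: the (source_id, split) record layout, the function names, and the split fractions are all assumptions.

```python
import hashlib
from collections import defaultdict

def shared_sources(frame_splits):
    """frame_splits: iterable of (source_id, split_name), one pair per frame.
    Returns {source_id: sorted splits} for sources present in more than one split."""
    seen = defaultdict(set)
    for source_id, split in frame_splits:
        seen[source_id].add(split)
    return {s: sorted(v) for s, v in seen.items() if len(v) > 1}

def source_level_split(source_id, val_frac=0.1, test_frac=0.1):
    """Assign an entire source to one split via a stable hash (fractions are placeholders)."""
    digest = hashlib.sha256(source_id.encode("utf-8")).hexdigest()
    u = int(digest[:8], 16) / 0x100000000  # deterministic value in [0, 1)
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"
```

Under source-level assignment every frame from a pullback lands in the same split, so shared_sources returns an empty dict. With few sources, the realized split sizes can deviate noticeably from the requested fractions.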

Only about half of the samples carry segmentation labels. This creates an imbalance between classification and segmentation supervision in multi-task training.
Performance can vary substantially by source group.

This indicates a need for stronger cross-source robustness analysis.
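One simple starting point for such an analysis is a per-source breakdown of bifurcation F1 at the fixed threshold. The sketch below assumes each evaluated frame is available as a (source_id, label, probability) record; that layout and the function name are hypothetical.

```python
from collections import defaultdict

def per_source_f1(records, t):
    """records: iterable of (source_id, y, y_hat). Returns {source_id: F1 or None}.
    F1 is None when a source has no positives and no positive predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for source_id, y, p in records:
        pred = 1 if p >= t else 0
        c = counts[source_id]  # also registers sources with only true negatives
        if pred == 1 and y == 1:
            c["tp"] += 1
        elif pred == 1 and y == 0:
            c["fp"] += 1
        elif pred == 0 and y == 1:
            c["fn"] += 1
    out = {}
    for source_id, c in counts.items():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        out[source_id] = 2 * c["tp"] / denom if denom else None
    return out
```

Reporting the spread of these values (for example minimum and maximum across sources) alongside the pooled metric makes source-dependent failures visible that a single aggregate number would hide.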
The current multi-task head is intentionally lightweight. This helps stability and runtime cost, but may under-capture fine spatial context around bifurcation patterns.
This report is intended to be self-contained. Supporting figures are stored under docs/memo_assets/.
PDF export command:
scripts/analysis/export_memo_pdf.sh