| <html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang=""> | |
| <head> | |
| <meta charset="utf-8" /> | |
| <meta name="generator" content="pandoc" /> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> | |
| <title>multitask_finetuning_comprehensive_memo</title> | |
| <style> | |
| code{white-space: pre-wrap;} | |
| span.smallcaps{font-variant: small-caps;} | |
| span.underline{text-decoration: underline;} | |
| div.column{display: inline-block; vertical-align: top; width: 50%;} | |
| div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;} | |
| ul.task-list{list-style: none;} | |
| pre > code.sourceCode { white-space: pre; position: relative; } | |
| pre > code.sourceCode > span { display: inline-block; line-height: 1.25; } | |
| pre > code.sourceCode > span:empty { height: 1.2em; } | |
| code.sourceCode > span { color: inherit; text-decoration: inherit; } | |
| div.sourceCode { margin: 1em 0; } | |
| pre.sourceCode { margin: 0; } | |
| @media screen { | |
| div.sourceCode { overflow: auto; } | |
| } | |
| @media print { | |
| pre > code.sourceCode { white-space: pre-wrap; } | |
| pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; } | |
| } | |
| pre.numberSource code | |
| { counter-reset: source-line 0; } | |
| pre.numberSource code > span | |
| { position: relative; left: -4em; counter-increment: source-line; } | |
| pre.numberSource code > span > a:first-child::before | |
| { content: counter(source-line); | |
| position: relative; left: -1em; text-align: right; vertical-align: baseline; | |
| border: none; display: inline-block; | |
| -webkit-touch-callout: none; -webkit-user-select: none; | |
| -khtml-user-select: none; -moz-user-select: none; | |
| -ms-user-select: none; user-select: none; | |
| padding: 0 4px; width: 4em; | |
| color: #aaaaaa; | |
| } | |
| pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; } | |
| div.sourceCode | |
| { } | |
| @media screen { | |
| pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; } | |
| } | |
| code span.al { color: #ff0000; font-weight: bold; } /* Alert */ | |
| code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */ | |
| code span.at { color: #7d9029; } /* Attribute */ | |
| code span.bn { color: #40a070; } /* BaseN */ | |
| code span.bu { } /* BuiltIn */ | |
| code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */ | |
| code span.ch { color: #4070a0; } /* Char */ | |
| code span.cn { color: #880000; } /* Constant */ | |
| code span.co { color: #60a0b0; font-style: italic; } /* Comment */ | |
| code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */ | |
| code span.do { color: #ba2121; font-style: italic; } /* Documentation */ | |
| code span.dt { color: #902000; } /* DataType */ | |
| code span.dv { color: #40a070; } /* DecVal */ | |
| code span.er { color: #ff0000; font-weight: bold; } /* Error */ | |
| code span.ex { } /* Extension */ | |
| code span.fl { color: #40a070; } /* Float */ | |
| code span.fu { color: #06287e; } /* Function */ | |
| code span.im { } /* Import */ | |
| code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */ | |
| code span.kw { color: #007020; font-weight: bold; } /* Keyword */ | |
| code span.op { color: #666666; } /* Operator */ | |
| code span.ot { color: #007020; } /* Other */ | |
| code span.pp { color: #bc7a00; } /* Preprocessor */ | |
| code span.sc { color: #4070a0; } /* SpecialChar */ | |
| code span.ss { color: #bb6688; } /* SpecialString */ | |
| code span.st { color: #4070a0; } /* String */ | |
| code span.va { color: #19177c; } /* Variable */ | |
| code span.vs { color: #4070a0; } /* VerbatimString */ | |
| code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */ | |
| </style> | |
| </head> | |
| <body> | |
| <h1 id="ivus-segmentation-and-bifurcation-detection">IVUS Segmentation and Bifurcation Detection</h1> | |
| <h2 id="comprehensive-multi-task-fine-tuning-report">Comprehensive Multi-Task Fine-Tuning Report</h2> | |
| <p>Date: February 20, 2026</p> | |
| <h2 id="1-purpose-and-scope">1) Purpose and Scope</h2> | |
| <p>This report documents the full methodology used to adapt a pretrained IVUS segmentation model into a multi-task model that performs:</p> | |
| <ul> | |
| <li>Lumen segmentation (pixel-level)</li> | |
| <li>Bifurcation detection (frame-level)</li> | |
| </ul> | |
| <p>The goal is to provide a self-contained technical description of model design, training behavior, threshold calibration, results, and limitations.</p> | |
| <h2 id="2-problem-setup">2) Problem Setup</h2> | |
| <p>Given an IVUS frame <code>x</code>, the model produces two outputs that are optimized jointly:</p> | |
| <ol> | |
| <li>Segmentation output <code>M_hat</code>: lumen mask over pixels</li> | |
| <li>Classification output <code>y_hat</code>: bifurcation probability in <code>[0,1]</code></li> | |
| </ol> | |
| <p>The model is trained at frame level. There is no temporal model (no recurrence, no sequence transformer, no optical flow objective).</p> | |
| <h2 id="3-data-and-labels">3) Data and Labels</h2> | |
| <h3 id="31-data-organization">3.1 Data organization</h3> | |
| <p>The dataset is built from a frame-bank of manually labeled IVUS frames with train/validation/test partitions.</p> | |
| <p>Split counts:</p> | |
| <ul> | |
| <li>Train: 420</li> | |
| <li>Validation: 90</li> | |
| <li>Test: 90</li> | |
| </ul> | |
| <h3 id="32-label-distributions">3.2 Label distributions</h3> | |
| <p>Bifurcation positive rate by split:</p> | |
| <ul> | |
| <li>Train: 65.2%</li> | |
| <li>Validation: 65.6%</li> | |
| <li>Test: 65.6%</li> | |
| </ul> | |
| <p>Lumen annotation coverage by split:</p> | |
| <ul> | |
| <li>Train: 47.4%</li> | |
| <li>Validation: 51.1%</li> | |
| <li>Test: 53.3%</li> | |
| </ul> | |
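| <p>Classification supervision is therefore denser than segmentation supervision in the multi-task setting: the classification loss averages over every frame in a batch, whereas only about half of the frames carry a lumen mask.</p> | |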
| <h3 id="33-balance-visualizations">3.3 Balance visualizations</h3> | |
| <p><img src="./memo_assets/split_class_balance_stacked.png" alt="Split class balance" /> <img src="./memo_assets/positive_rate_by_split.png" alt="Positive rate by split" /> <img src="./memo_assets/lumen_coverage_by_split.png" alt="Lumen coverage by split" /></p> | |
| <h2 id="4-model-design">4) Model Design</h2> | |
| <h3 id="41-backbone--multi-task-head">4.1 Backbone + multi-task head</h3> | |
| <p>A pretrained segmentation backbone is reused as initialization.</p> | |
| <p>A lightweight <strong>multi-task classification head</strong> is attached on top of segmentation logits:</p> | |
| <ul> | |
| <li>Global average pooling over spatial dimensions</li> | |
| <li>Dense layer (ReLU)</li> | |
| <li>Dropout</li> | |
| <li>Final sigmoid output for bifurcation probability</li> | |
| </ul> | |
| <p>This is a multi-task head, not an attention module.</p> | |
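| <p>As a rough illustration of the head listed above, the following pure-Python sketch runs the inference-time forward pass: global average pooling, a ReLU dense layer, and a sigmoid output. The function name <code>classification_head</code>, the weight layout, and the layer sizes are assumptions made for this example; dropout is omitted because it acts as the identity at inference.</p> | |

```python
import math

def classification_head(seg_logits, w1, b1, w2, b2):
    """Inference-time forward pass of a pooled classification head (sketch).

    seg_logits: list of C channel maps, each an H x W nested list.
    w1, b1:     dense-layer weights (one row of C weights per hidden unit) and biases.
    w2, b2:     output weights (one per hidden unit) and scalar bias.
    All shapes and names are hypothetical; dropout is the identity at inference.
    """
    # Global average pooling: one scalar per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in seg_logits]
    # Dense layer followed by ReLU.
    hidden = [max(0.0, sum(w * p for w, p in zip(w_row, pooled)) + b)
              for w_row, b in zip(w1, b1)]
    # Single logit squashed to a bifurcation probability.
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

# Toy usage: one 2x2 channel, two hidden units.
prob = classification_head([[[1.0, 2.0], [3.0, 4.0]]],
                           w1=[[1.0], [-1.0]], b1=[0.0, 0.0],
                           w2=[0.4, 0.3], b2=-1.0)
```

| <p>Because pooling collapses the spatial dimensions before the dense layer, the head sees only per-channel averages, which is the capacity tradeoff discussed under Limitations.</p> | |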
| <h3 id="42-task-coupling-strategy">4.2 Task coupling strategy</h3> | |
| <p>The segmentation branch and classification branch share upstream representation. This encourages feature reuse while keeping task-specific outputs separate.</p> | |
| <h3 id="43-conceptual-architecture">4.3 Conceptual architecture</h3> | |
| <p><img src="./memo_assets/multitask_pipeline_diagram.png" alt="Multi-task training and inference diagram" /></p> | |
| <h2 id="5-preprocessing-and-input-construction">5) Preprocessing and Input Construction</h2> | |
| <p>For each frame:</p> | |
| <ol> | |
| <li>Apply a central black-circle mask to suppress the catheter and related artifacts near the image center.</li> | |
| <li>Convert grayscale to network input representation.</li> | |
| <li>Align labels to frame indices.</li> | |
| </ol> | |
| <p>For segmentation labels, only frames with valid lumen polygons are supervised.</p> | |
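| <p>The central black-circle step can be sketched as follows. This is a minimal version that assumes the circle is centered on the frame and that pixels are stored as a nested list; the function name <code>mask_center_circle</code> and the <code>radius</code> parameter are hypothetical, and the radius actually used is not stated in this report.</p> | |

```python
def mask_center_circle(frame, radius):
    """Return a copy of `frame` with a centered disc of pixels set to 0.

    frame:  H x W nested list of grayscale values.
    radius: disc radius in pixels (hypothetical parameter, not from the report).
    """
    h, w = len(frame), len(frame[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0  # geometric center of the pixel grid
    return [
        [0 if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2 else frame[y][x]
         for x in range(w)]
        for y in range(h)
    ]

# Toy usage on a 5x5 frame of constant intensity.
masked = mask_center_circle([[9] * 5 for _ in range(5)], radius=1)
```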
| <h2 id="6-loss-functions-and-optimization">6) Loss Functions and Optimization</h2> | |
| <p>Let <code>i</code> index samples in a minibatch.</p> | |
| <ul> | |
| <li><code>m_i in {0,1}^{H x W}</code>: ground-truth lumen mask</li> | |
| <li><code>m_hat_i</code>: predicted lumen probability map</li> | |
| <li><code>y_i in {0,1}</code>: bifurcation label</li> | |
| <li><code>y_hat_i in (0,1)</code>: bifurcation probability</li> | |
| <li><code>h_i in {0,1}</code>: has-mask indicator (1 if segmentation label exists)</li> | |
| </ul> | |
| <h3 id="61-segmentation-loss">6.1 Segmentation loss</h3> | |
| <p>Weighted BCE + Dice:</p> | |
| <pre class="text"><code>L_seg,i = L_wbce(m_i, m_hat_i; w_pos) + lambda_dice * L_dice(m_i, m_hat_i) | |
| </code></pre> | |
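| <p>A minimal sketch of the per-sample segmentation loss, under stated assumptions: the BCE term is averaged over pixels, the Dice term is the soft (probabilistic) variant with a smoothing constant, and the default values of <code>w_pos</code> and <code>lambda_dice</code> are placeholders rather than the values used in training.</p> | |

```python
import math

def weighted_bce(mask, prob, w_pos, eps=1e-7):
    """Pixel-mean BCE with positive (lumen) pixels up-weighted by w_pos."""
    total, count = 0.0, 0
    for m_row, p_row in zip(mask, prob):
        for m, p in zip(m_row, p_row):
            p = min(max(p, eps), 1.0 - eps)  # clip so log() stays finite
            total -= w_pos * m * math.log(p) + (1 - m) * math.log(1.0 - p)
            count += 1
    return total / count

def dice_loss(mask, prob, smooth=1.0):
    """Soft Dice loss; `smooth` is an assumption that avoids 0/0 on empty masks."""
    inter = sum(m * p for m_row, p_row in zip(mask, prob) for m, p in zip(m_row, p_row))
    denom = sum(map(sum, mask)) + sum(map(sum, prob))
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)

def seg_loss(mask, prob, w_pos=2.0, lambda_dice=1.0):
    """L_seg,i = L_wbce + lambda_dice * L_dice (default weights are placeholders)."""
    return weighted_bce(mask, prob, w_pos) + lambda_dice * dice_loss(mask, prob)

# Toy usage: a prediction close to the mask scores lower than an inverted one.
mask = [[1, 0], [0, 1]]
good = seg_loss(mask, [[0.9, 0.1], [0.1, 0.9]])
bad = seg_loss(mask, [[0.1, 0.9], [0.9, 0.1]])
```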
| <p>Masked batch aggregation (only labeled masks contribute):</p> | |
| <pre class="text"><code>L_seg = (sum_i h_i * L_seg,i) / (sum_i h_i + eps) | |
| </code></pre> | |
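| <p>The masked aggregation can be illustrated with a few lines of Python. The helper name <code>masked_mean</code> is hypothetical; the sketch assumes that unlabeled frames carry a finite placeholder loss value, since multiplying by <code>h_i = 0</code> would not neutralize a NaN.</p> | |

```python
def masked_mean(per_sample_losses, has_mask, eps=1e-7):
    """L_seg = sum_i(h_i * L_seg,i) / (sum_i h_i + eps).

    per_sample_losses: one finite loss value per frame in the batch.
    has_mask:          0/1 indicators; 1 if the frame has a lumen label.
    """
    numerator = sum(h * loss for h, loss in zip(has_mask, per_sample_losses))
    return numerator / (sum(has_mask) + eps)

# Toy usage: the middle frame has no lumen label, so its loss is ignored.
batch_loss = masked_mean([0.2, 9.9, 0.4], has_mask=[1, 0, 1])
# A batch with no labeled masks yields 0 instead of a division error.
empty_loss = masked_mean([1.0, 2.0], has_mask=[0, 0])
```

| <p>The <code>eps</code> term only matters for batches without any labeled mask; otherwise the result is effectively the mean over labeled frames.</p> | |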
| <h3 id="62-classification-loss">6.2 Classification loss</h3> | |
| <p>Binary cross entropy:</p> | |
| <pre class="text"><code>L_cls = (1/B) * sum_i L_bce(y_i, y_hat_i) | |
| </code></pre> | |
| <h3 id="63-total-objective">6.3 Total objective</h3> | |
| <pre class="text"><code>L_total = w_seg * L_seg + w_cls * L_cls | |
| </code></pre> | |
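| <p>Putting the two terms together, a sketch of the total objective for one batch might look as follows. The function names and the default task weights <code>w_seg</code> and <code>w_cls</code> are placeholders; the report does not state the weights used.</p> | |

```python
import math

def bce(y, y_hat, eps=1e-7):
    """Binary cross entropy for one frame-level label and probability."""
    y_hat = min(max(y_hat, eps), 1.0 - eps)  # clip so log() stays finite
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))

def total_loss(l_seg, labels, probs, w_seg=1.0, w_cls=1.0):
    """L_total = w_seg * L_seg + w_cls * L_cls, with L_cls averaged over the batch."""
    l_cls = sum(bce(y, p) for y, p in zip(labels, probs)) / len(labels)
    return w_seg * l_seg + w_cls * l_cls

# Toy usage: an uninformative classifier (p = 0.5) contributes ln(2) to the loss.
loss = total_loss(l_seg=0.3, labels=[1, 0], probs=[0.5, 0.5])
```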
| <h3 id="64-optimization-behavior">6.4 Optimization behavior</h3> | |
| <ul> | |
| <li>GradientTape-style explicit optimization loop for multi-task fine-tuning</li> | |
| <li>Gradient clipping by global norm for stability</li> | |
| <li>Early stopping using validation objective</li> | |
| <li>Best-checkpoint restore before final export</li> | |
| </ul> | |
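| <p>Two of the items above, global-norm gradient clipping and early stopping with best-checkpoint restore, can be sketched framework-free as follows. The names <code>clip_by_global_norm</code> and <code>EarlyStopper</code>, the <code>patience</code> parameter, and the flat-list gradient layout are assumptions for illustration; the actual training loop is not reproduced here.</p> | |

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Rescale all gradient tensors jointly so their global L2 norm is <= clip_norm.

    grads: list of flat lists, one per parameter tensor (hypothetical layout).
    """
    global_norm = math.sqrt(sum(g * g for tensor in grads for g in tensor))
    scale = clip_norm / max(global_norm, clip_norm)  # equals 1.0 when already small
    return [[g * scale for g in tensor] for tensor in grads], global_norm

class EarlyStopper:
    """Track the best validation objective and signal when to stop."""

    def __init__(self, patience):
        self.patience = patience      # hypothetical hyperparameter
        self.best = math.inf
        self.best_state = None        # stands in for a saved checkpoint
        self.bad_epochs = 0

    def update(self, val_objective, state):
        """Record one epoch; return True when training should stop."""
        if val_objective < self.best:
            self.best, self.best_state, self.bad_epochs = val_objective, state, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Toy usage: stop after two epochs without improvement, then restore the best state.
stopper = EarlyStopper(patience=2)
stopped_at = None
for epoch, val in enumerate([1.0, 0.8, 0.9, 0.85]):
    if stopper.update(val, state=f"epoch{epoch}"):
        stopped_at = epoch
        break
clipped, norm = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)
```

| <p>Clipping by the global norm preserves the direction of the joint update, which matters when segmentation and classification gradients differ in scale.</p> | |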
| <h2 id="7-threshold-selection-and-operating-point">7) Threshold Selection and Operating Point</h2> | |
| <p>After training, the bifurcation threshold <code>t</code> is selected on validation data by a grid search over candidate thresholds.</p> | |
| <p>For each <code>t</code>:</p> | |
| <pre class="text"><code>y_hat_i^(t) = 1[y_hat_i >= t] | |
| </code></pre> | |
| <p>Compute precision, recall, F1, accuracy, etc., then choose:</p> | |
| <pre class="text"><code>t* = argmax_t F1_val(t) | |
| </code></pre> | |
| <p>The selected threshold is persisted and reused during runtime inference.</p> | |
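| <p>The grid search described in this section can be sketched as below. The candidate grid (0.1 to 0.9 in steps of 0.1), the toy labels and scores, and the tie-breaking rule (the lowest threshold wins) are assumptions for this example, not details taken from the training code.</p> | |

```python
def f1_at(labels, probs, t):
    """F1 score of the thresholded predictions 1[p >= t]."""
    tp = sum(1 for y, p in zip(labels, probs) if p >= t and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= t and y == 0)
    fn = sum(1 for y, p in zip(labels, probs) if p < t and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def select_threshold(labels, probs, candidates):
    """t* = argmax_t F1_val(t); max() keeps the first (lowest) t on ties."""
    return max(candidates, key=lambda t: f1_at(labels, probs, t))

# Toy validation set (illustrative values only).
val_labels = [0, 0, 1, 1, 1]
val_probs = [0.2, 0.6, 0.55, 0.7, 0.9]
grid = [i / 10 for i in range(1, 10)]
t_star = select_threshold(val_labels, val_probs, grid)
best_f1 = f1_at(val_labels, val_probs, t_star)
```

| <p>At runtime the persisted <code>t_star</code> would simply be applied as <code>prob &gt;= t_star</code> to each frame's bifurcation probability.</p> | |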
| <h2 id="8-training-dynamics">8) Training Dynamics</h2> | |
| <h3 id="81-multi-task-fine-tuning-dynamics">8.1 Multi-task fine-tuning dynamics</h3> | |
| <p><img src="./memo_assets/multitask_training_dynamics.png" alt="Multi-task training dynamics" /></p> | |
| <p>Observed behavior:</p> | |
| <ul> | |
| <li>Validation classification AUC reaches a high plateau relatively early.</li> | |
| <li>Validation F1 is more threshold-sensitive and fluctuates more.</li> | |
| <li>Segmentation metrics remain strong but vary with sparse segmentation supervision.</li> | |
| </ul> | |
| <h3 id="82-lumen-only-fine-tuning-dynamics">8.2 Lumen-only fine-tuning dynamics</h3> | |
| <p><img src="./memo_assets/lumen_finetune_dynamics.png" alt="Lumen fine-tune dynamics" /></p> | |
| <h2 id="9-test-performance-summary">9) Test Performance Summary</h2> | |
| <h3 id="91-multi-task-test-metrics">9.1 Multi-task test metrics</h3> | |
| <p>Segmentation (subset with lumen labels):</p> | |
| <ul> | |
| <li>IoU: 0.856</li> | |
| <li>Dice: 0.923</li> | |
| </ul> | |
| <p>Bifurcation classification:</p> | |
| <ul> | |
| <li>Accuracy: 0.900</li> | |
| <li>Precision: 0.891</li> | |
| <li>Recall: 0.966</li> | |
| <li>F1: 0.927</li> | |
| <li>AUC: 0.961</li> | |
| </ul> | |
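| <p>As a consistency check on the classification metrics above, the sketch below recomputes them from a set of confusion-matrix counts. The counts are an assumption: they are not read from the confusion-matrix figure, but they are one set that matches 90 test frames with roughly 59 positives (65.6%) and rounds to the reported values.</p> | |

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts chosen to be consistent with the reported metrics;
# the actual confusion matrix is shown only as a figure in this report.
metrics = classification_metrics(tp=57, fp=7, fn=2, tn=24)
rounded = {name: round(value, 3) for name, value in metrics.items()}
```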
| <p>Confusion matrix:</p> | |
| <p><img src="./memo_assets/multitask_test_confusion_matrix.png" alt="Multitask confusion matrix" /></p> | |
| <p>Metric snapshot:</p> | |
| <p><img src="./memo_assets/multitask_test_metric_snapshot.png" alt="Multitask metric snapshot" /></p> | |
| <h3 id="92-segmentation-regime-comparison">9.2 Segmentation regime comparison</h3> | |
| <p><img src="./memo_assets/segmentation_regime_comparison.png" alt="Segmentation comparison" /></p> | |
| <p>Note: the compared evaluations do not use identical sample sets, so the comparison indicates direction only and should not be read as a controlled benchmark.</p> | |
| <h2 id="10-threshold-and-calibration-diagnostics">10) Threshold and Calibration Diagnostics</h2> | |
| <p>Standalone classifier diagnostics (supporting analysis):</p> | |
| <p><img src="./memo_assets/standalone_threshold_sweep.png" alt="Threshold sweep" /> <img src="./memo_assets/standalone_probability_hist.png" alt="Probability histogram" /> <img src="./memo_assets/standalone_reliability_diagram.png" alt="Reliability diagram" /> <img src="./memo_assets/precision_recall_curve_with_operating_point.png" alt="Precision-recall curve with operating point" /></p> | |
| <p>These plots illustrate threshold sensitivity, score separation, and calibration quality.</p> | |
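| <p>For readers who want to reproduce a reliability diagram numerically, the following sketch bins predicted probabilities and compares mean confidence with observed positive frequency, then summarizes the gap as an expected calibration error. The equal-width binning, the bin count, and the function names are assumptions; the report does not specify how its diagram was computed.</p> | |

```python
def reliability_bins(labels, probs, n_bins=10):
    """Return (mean confidence, positive frequency, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    rows = []
    for members in bins:
        if members:
            conf = sum(p for _, p in members) / len(members)
            freq = sum(y for y, _ in members) / len(members)
            rows.append((conf, freq, len(members)))
    return rows

def expected_calibration_error(rows):
    """Count-weighted mean of |confidence - frequency| over bins."""
    total = sum(count for _, _, count in rows)
    return sum(count / total * abs(conf - freq) for conf, freq, count in rows)

# Toy usage with two bins (illustrative values only).
rows = reliability_bins([0, 1, 1, 1], [0.1, 0.9, 0.9, 0.9], n_bins=2)
ece = expected_calibration_error(rows)
```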
| <h2 id="11-limitations">11) Limitations</h2> | |
| <h3 id="111-split-caveat-source-overlap">11.1 Split caveat: source overlap</h3> | |
| <p>Train/validation/test share source pullback files (frame-level partitioning rather than source-level partitioning).</p> | |
| <p>Because the model is frame-independent, this is not temporal leakage. However, repeated source style/statistics across splits can make in-domain metrics optimistic.</p> | |
| <p><img src="./memo_assets/split_source_overlap_heatmap.png" alt="Split source overlap" /></p> | |
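| <p>A source-level split, in contrast to the frame-level split used here, assigns every frame from one pullback to the same partition. The sketch below shows one way to do this; the function name <code>split_by_source</code>, the <code>(frame_id, source_id)</code> input format, and the seed are hypothetical, and the 15% validation and test fractions simply mirror the 420/90/90 split. Frame counts per partition will only approximate those fractions when pullbacks differ in length.</p> | |

```python
import random

def split_by_source(frames, val_frac=0.15, test_frac=0.15, seed=0):
    """Partition frames so that no source pullback appears in two splits.

    frames: list of (frame_id, source_id) pairs (hypothetical format).
    """
    sources = sorted({src for _, src in frames})
    random.Random(seed).shuffle(sources)  # deterministic shuffle for a given seed
    n_test = max(1, round(len(sources) * test_frac))
    n_val = max(1, round(len(sources) * val_frac))
    test_src = set(sources[:n_test])
    val_src = set(sources[n_test:n_test + n_val])
    split = {"train": [], "val": [], "test": []}
    for frame_id, src in frames:
        key = "test" if src in test_src else "val" if src in val_src else "train"
        split[key].append(frame_id)
    return split

# Toy usage: 10 pullbacks with 6 frames each.
frames = [(f"p{s}_f{i}", f"p{s}") for s in range(10) for i in range(6)]
split = split_by_source(frames)
```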
| <h3 id="112-uneven-supervision-density">11.2 Uneven supervision density</h3> | |
| <p>Only about half of the samples carry segmentation labels (47.4% to 53.3% across splits). This creates an imbalance between classification and segmentation supervision in multi-task training.</p> | |
| <h3 id="113-domain-shift-across-source-groups">11.3 Domain shift across source groups</h3> | |
| <p>Performance can vary substantially by source group.</p> | |
| <p><img src="./memo_assets/standalone_group_metrics.png" alt="Group-wise standalone metrics" /></p> | |
| <p>This indicates a need for stronger cross-source robustness analysis.</p> | |
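| <p>A group-wise breakdown of this kind can be computed with a small helper. The sketch below reports accuracy per source group; the function name <code>accuracy_by_group</code> and the toy group labels are assumptions, and the same pattern applies to any other per-group metric.</p> | |

```python
from collections import defaultdict

def accuracy_by_group(labels, preds, groups):
    """Accuracy of thresholded predictions, computed separately per source group."""
    hits, counts = defaultdict(int), defaultdict(int)
    for y, y_pred, group in zip(labels, preds, groups):
        counts[group] += 1
        hits[group] += int(y == y_pred)
    return {group: hits[group] / counts[group] for group in counts}

# Toy usage with two hypothetical source groups.
per_group = accuracy_by_group(
    labels=[1, 0, 1, 1, 0, 0],
    preds=[1, 0, 1, 0, 1, 0],
    groups=["A", "A", "A", "B", "B", "B"],
)
```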
| <h3 id="114-head-capacity-tradeoff">11.4 Head capacity tradeoff</h3> | |
| <p>The current multi-task head is intentionally lightweight. This helps stability and runtime cost, but may under-capture fine spatial context around bifurcation patterns.</p> | |
| <h2 id="12-practical-conclusions">12) Practical Conclusions</h2> | |
| <ol> | |
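| <li>The current multi-task approach is effective on the in-domain test split (segmentation Dice 0.923, bifurcation F1 0.927) and operationally coherent: one model serves both tasks with a persisted decision threshold.</li> | |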
| <li>Validation-driven thresholding is critical and should remain part of deployment.</li> | |
| <li>The largest methodological caveat is source-overlap evaluation, not temporal modeling leakage.</li> | |
| <li>The next major quality gain will likely come from stricter source-level split protocols and robustness-focused evaluation.</li> | |
| </ol> | |
| <h2 id="13-reproducibility-note">13) Reproducibility Note</h2> | |
| <p>This report is intended to be self-contained. Supporting figures are stored under <code>docs/memo_assets/</code>.</p> | |
| <p>PDF export command:</p> | |
| <div class="sourceCode" id="cb7"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true"></a><span class="ex">scripts/analysis/export_memo_pdf.sh</span></span></code></pre></div> | |
| </body> | |
| </html> | |