---
license: mit
---

# Model Card for AdaLoRA-QAT

AdaLoRA-QAT is an efficient, compact foundation model variant designed for accurate chest X-ray (CXR) lung segmentation. It adapts the Segment Anything Model (SAM) to meet strict clinical computational constraints by combining adaptive low-rank parameter fine-tuning with quantization-aware training.

## Model Details

### Model Description

AdaLoRA-QAT introduces a two-stage fine-tuning framework for medical image segmentation. Stage 1 uses Adaptive Low-Rank Adaptation (AdaLoRA) to dynamically allocate rank capacity to task-relevant transformer layers in full precision. Stage 2 applies full-model quantization-aware fine-tuning (QAT) with a selective mixed-precision strategy, quantizing selected layers to INT8 while preserving fine structural fidelity.
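As a rough illustration (not the authors' code), the AdaLoRA-style update can be sketched in NumPy: the weight delta is a low-rank product whose inner singular values are pruned by importance, shrinking the effective rank. All names here are hypothetical, and the singular-value vector is called `lam` for readability.

```python
import numpy as np

def adalora_delta(P, lam, Q, mask):
    """Low-rank update delta_W = P @ diag(lam * mask) @ Q.

    P: (d_out, r), lam: (r,), Q: (r, d_in). The mask zeroes pruned
    singular components, shrinking the effective rank of the update.
    """
    return (P * (lam * mask)) @ Q

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 4
P = rng.standard_normal((d_out, r))
lam = rng.standard_normal(r)
Q = rng.standard_normal((r, d_in))

# Importance-based pruning: keep only the top-2 components by magnitude.
keep = 2
mask = np.zeros(r)
mask[np.argsort(-np.abs(lam))[:keep]] = 1.0

delta = adalora_delta(P, lam, Q, mask)
print(np.linalg.matrix_rank(delta))  # 2 (= keep)
```

In the full method the importance scores are derived from training signals (not raw magnitude) and the mask evolves during Stage 1; this sketch only shows the rank-shrinking mechanism.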

- **Developed by:** Prantik Deb, Srimanth Dhondy, N. Ramakrishna, Anu Kapoor, Raju S. Bapi, Tapabrata Chakraborti
- **Funded by:** IHub-Data, International Institute of Information Technology Hyderabad. Tapabrata Chakraborti is supported by the Turing-Roche Strategic Partnership and the UCL NIHR Biomedical Research Centre.
- **Model type:** Parameter-Efficient Fine-Tuned (PEFT) foundation model for chest X-ray segmentation
- **License:** MIT
- **Finetuned from model:** Segment Anything Model (SAM)

### Model Sources

- **Repository:** https://prantik-pdeb.github.io/adaloraqat.github.io/
- **Paper:** AdaLoRA-QAT: Adaptive Low Rank and Quantization Aware Segmentation

## Uses

### Direct Use

* Accurate lung field segmentation for isolating the pulmonary parenchyma.
* Enhancing abnormality visibility and enabling quantitative analysis in chest radiographs.
* Improving the reliability of computer-aided diagnosis (CAD) systems.
* Enabling deployable foundation models on resource-constrained clinical hardware.

## Bias, Risks, and Limitations

* Robust generalization across deep learning models remains challenging due to anatomical variability.
* Generalization is also challenged by pathological distortions and imaging artifacts.
* The Structural Similarity Index (SSIM) map indicates minor degradations, primarily associated with severe motion artifacts or extreme pathologies.

## Training Details

### Training Data

* The model was trained on 64,590 chest X-rays spanning diverse thoracic pathologies.
* Data sources include the JSRT, QaTa-COV19, COVID-19 Radiography, Chest X-Ray Pneumothorax, and COVID-QU-Ex datasets.

### Training Procedure

* The model uses a unified two-stage framework coupling adaptive low-rank encoder tuning with full-model quantization-aware fine-tuning.
* Stage 1 learns adaptive, orthogonal low-rank subspaces in full precision (FP32).
* Stage 1 prunes redundant components to identify an efficient task-specific parameter space.
* Stage 2 performs full-model quantization-aware fine-tuning while freezing the rank masks.
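To make the Stage 2 mechanics concrete, here is a minimal NumPy sketch of symmetric per-tensor INT8 fake quantization, the core operation simulated during quantization-aware fine-tuning. The exact quantizer used by the authors (granularity, calibration, rounding mode) may differ.

```python
import numpy as np

def fake_quant_int8(w):
    """Simulate symmetric per-tensor INT8 quantization.

    Weights are scaled into [-127, 127], rounded, then de-quantized,
    so the rounding error is exposed to the training loss; in QAT,
    gradients flow through via a straight-through estimator.
    """
    scale = np.abs(w).max() / 127.0
    if scale == 0:
        return w
    q = np.clip(np.round(w / scale), -127, 127)
    return q * scale

rng = np.random.default_rng(1)
w = rng.standard_normal((64, 64)).astype(np.float32)
w_q = fake_quant_int8(w)

# The per-element error is bounded by half a quantization step.
print(float(np.abs(w - w_q).max()))
```

Keeping attention QKV projections and the AdaLoRA parameters in FP32 (as the hyperparameters below describe) simply means those tensors bypass this fake-quant path.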

#### Training Hyperparameters

- **Training regime:** Stage 1 uses FP32 precision. Stage 2 uses a selective mixed-precision strategy: encoder feed-forward layers, the decoder, and the prompt encoder are quantized to INT8, while attention QKV projections and AdaLoRA parameters (P, Q, A) remain in FP32.
- **Batch size:** 16 during Stage 1.
- **Learning rates:** In Stage 1, 5e-5 for the encoder and 2e-5 for the decoder. In Stage 2, singular values are fine-tuned at 1e-6.
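These per-module learning rates map naturally onto optimizer parameter groups. The sketch below is purely illustrative: the group names are hypothetical, and a real torch optimizer would receive such dicts with the actual parameter tensors under a `"params"` key.

```python
# Illustrative per-module learning-rate groups (names hypothetical).
stage1_groups = [
    {"module": "image_encoder_adalora", "lr": 5e-5},  # Stage 1: encoder
    {"module": "mask_decoder",          "lr": 2e-5},  # Stage 1: decoder
]
stage2_groups = [
    {"module": "singular_values",       "lr": 1e-6},  # Stage 2: lambda only
]
print([g["lr"] for g in stage1_groups + stage2_groups])
```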

#### Speeds, Sizes, Times

* The model achieves 2.24x compression relative to base-SAM fine-tuning.
* Trainable parameters are reduced by 16.6x, down to 5.4M.
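A quick sanity check on these figures, assuming the 16.6x factor applies to the trainable-parameter count:

```python
# 16.6x fewer trainable parameters, down to 5.4M, implies a baseline of
# roughly 5.4M * 16.6 ~ 90M trainable parameters for base-SAM fine-tuning.
reduced_params = 5.4e6
baseline_params = reduced_params * 16.6
print(round(baseline_params / 1e6, 1))  # 89.6
```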

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

* The 64,590-image CXR dataset was divided using an 80:10:10 train/validation/test split for all experiments.
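The split can be reproduced in spirit with a shuffled index partition (sketch only; the actual split procedure and seed are not specified in this card):

```python
import numpy as np

def split_indices(n, seed=0, ratios=(0.8, 0.1, 0.1)):
    """Shuffle n sample indices and partition them 80:10:10."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_indices(64_590)
print(len(train), len(val), len(test))  # 51672 6459 6459
```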

#### Metrics

* Dice Similarity Coefficient (DSC).
* Intersection over Union (IoU).
* Normalized Surface Distance (NSD).
* Structural Similarity Index (SSIM), to evaluate structural agreement and localized improvements.
* Wilcoxon signed-rank test for statistical significance assessment.
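Dice and IoU are straightforward to compute from binary masks; a minimal NumPy reference is below (NSD and SSIM require surface-distance and windowed-statistics machinery, so they are omitted here):

```python
import numpy as np

def dice_iou(pred, gt):
    """Dice score and IoU for binary masks (values in {0, 1})."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum())
    iou = inter / np.logical_or(pred, gt).sum()
    return dice, iou

pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 1, 0], [0, 0, 0]])
dice, iou = dice_iou(pred, gt)
print(round(float(dice), 3), round(float(iou), 3))  # 0.8 0.667
```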

### Results

* AdaLoRA-QAT achieves a 95.6% Dice score.
* The model matches the accuracy of full-precision SAM decoder fine-tuning.
* Statistical analysis confirms that full INT8 quantization preserves segmentation accuracy without significant degradation.
* SSIM analysis shows strong structural agreement along lung boundaries and vascular regions.

#### Summary

AdaLoRA-QAT balances accuracy, efficiency, and structural trustworthiness. It establishes a proof of concept for substantially compressing foundation models for scalable AI-assisted diagnosis without compromising diagnostic accuracy.

## Model Examination

* Quantization error analysis shows that FP32-to-INT8 quantization noise follows an approximately zero-mean Gaussian distribution.
* There is a strong linear correlation between FP32 and INT8 weights.
* Errors are uniformly distributed across weight magnitudes, confirming preserved numerical fidelity under low-bit quantization.
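The examination described above can be reproduced in miniature: quantize a Gaussian weight tensor to INT8, de-quantize it, and inspect the error statistics. This uses synthetic data, not the model's actual weights.

```python
import numpy as np

rng = np.random.default_rng(2)
w_fp32 = rng.standard_normal(10_000).astype(np.float32)

# Symmetric per-tensor INT8 quantize / de-quantize.
scale = np.abs(w_fp32).max() / 127.0
w_deq = np.clip(np.round(w_fp32 / scale), -127, 127) * scale

err = w_fp32 - w_deq                     # quantization noise
corr = np.corrcoef(w_fp32, w_deq)[0, 1]  # FP32 vs de-quantized INT8

print(f"mean error {err.mean():+.1e}, correlation {corr:.5f}")
```

The near-zero mean error and near-unit correlation mirror the qualitative findings reported above.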

### Model Architecture and Objective

* The architecture is based on the Segment Anything Model (SAM), comprising an image encoder (ViT), a prompt encoder, and a mask decoder.
* It uses Adaptive Low-Rank Adaptation (AdaLoRA), with the vision encoder rank reduced from 48 to 32 via importance-based pruning.

#### Hardware

* NVIDIA RTX A6000 GPUs (48 GB).

## Model Card Authors

Prantik Deb, Srimanth Dhondy, N. Ramakrishna, Anu Kapoor, Raju S. Bapi, Tapabrata Chakraborti