Model Card for AdaLoRA-QAT

AdaLoRA-QAT is an efficient, compact foundation model variant designed for accurate chest X-ray (CXR) lung segmentation. It adapts the Segment Anything Model (SAM) to meet strict clinical computational constraints by combining adaptive low-rank parameter fine-tuning with quantization-aware training.

Model Details

Model Description

AdaLoRA-QAT introduces a two-stage fine-tuning framework for medical image segmentation. Stage 1 utilizes Adaptive Low-Rank Adaptation (AdaLoRA) to dynamically allocate rank capacity to task-relevant transformer layers in full precision. Stage 2 implements full-model quantization-aware fine-tuning (QAT) using a selective mixed-precision strategy, achieving INT8 precision for select layers while preserving fine structural fidelity.

  • Developed by: Prantik Deb, Srimanth Dhondy, N. Ramakrishna, Anu Kapoor, Raju S. Bapi, Tapabrata Chakraborti.
  • Funded by: IHub-Data, International Institute of Information Technology Hyderabad. Tapabrata Chakraborti is supported by the Turing-Roche Strategic Partnership and the UCL NIHR Biomedical Research Center.
  • Model type: Parameter-Efficient Fine-Tuned (PEFT) Foundation Model for Chest X-ray Segmentation.
  • License: MIT License
  • Finetuned from model: Segment Anything Model (SAM).

Uses

Direct Use

  • Accurate lung field segmentation for isolating pulmonary parenchyma.
  • Enhancing abnormality visibility and enabling quantitative analysis in chest radiographs.
  • Improving the reliability of computer-aided diagnosis (CAD) systems.
  • Enabling deployable foundation models on resource-constrained clinical hardware.
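A minimal inference sketch for the lung-field segmentation use case above. The model handle, preprocessing, and thresholding here are illustrative assumptions, not the released API; a stand-in identity model is used so the sketch runs end to end.

```python
# Hedged sketch: segment lung fields from a single CXR with a generic
# per-pixel segmentation model (names and preprocessing are illustrative).
import numpy as np

def segment_lungs(model, cxr: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return a binary lung mask for one chest X-ray."""
    x = (cxr - cxr.mean()) / (cxr.std() + 1e-8)   # per-image normalization
    logits = model(x)                             # model emits per-pixel logits
    probs = 1.0 / (1.0 + np.exp(-logits))         # sigmoid
    return (probs >= threshold).astype(np.uint8)

# Usage with a stand-in model (identity logits) on a toy image:
dummy_model = lambda x: x
mask = segment_lungs(dummy_model, np.random.randn(256, 256))
print(mask.shape, mask.dtype)  # (256, 256) uint8
```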

Bias, Risks, and Limitations

  • Robust generalization across deep learning models remains challenging due to anatomical variability, pathological distortions, and imaging artifacts.
  • The Structural Similarity Index (SSIM) map indicates minor degradations, primarily associated with severe motion artifacts or extreme pathologies.

Training Details

Training Data

  • The model was trained on 64,590 chest X-rays spanning diverse thoracic pathologies.
  • The data sources include JSRT, QaTa-COV19, COVID-19 Radiography, Chest X-Ray Pneumothorax, and COVID-QU-Ex datasets.

Training Procedure

  • The model utilizes a unified two-stage framework coupling adaptive low-rank encoder tuning with full model quantization-aware fine-tuning.
  • Stage 1 learns adaptive and orthogonal low-rank subspaces in full precision (FP32).
  • Stage 1 prunes redundant components to identify an efficient task-specific parameter space.
  • Stage 2 performs full-model quantization-aware fine-tuning while freezing rank masks.
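The two stages above can be sketched compactly: Stage 1 trains low-rank adapter factors in FP32 and prunes ranks by importance; Stage 2 freezes the surviving rank mask and fine-tunes under simulated INT8 quantization. Shapes, the importance criterion, and variable names below are illustrative assumptions, not the authors' exact implementation.

```python
# Hedged sketch of the two-stage schedule (toy dimensions; assumptions noted).
import torch

torch.manual_seed(0)
d, r = 8, 4
P = torch.randn(d, r, requires_grad=True)   # left low-rank factor
Q = torch.randn(r, d, requires_grad=True)   # right low-rank factor
lam = torch.randn(r, requires_grad=True)    # per-rank singular values

# Stage 1: importance-based pruning keeps the top-k ranks (here k = 2).
importance = lam.abs().detach()
mask = torch.zeros(r)
mask[importance.topk(2).indices] = 1.0      # rank mask, frozen for Stage 2

def fake_quant_int8(w: torch.Tensor) -> torch.Tensor:
    """Symmetric per-tensor INT8 simulation with a straight-through estimator."""
    scale = w.detach().abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -128, 127) * scale
    return w + (q - w).detach()             # forward quantized, gradient identity

# Stage 2: adapter delta uses the frozen mask; base weight is fake-quantized.
W_base = torch.randn(d, d)
delta = P @ torch.diag(lam * mask) @ Q
W_eff = fake_quant_int8(W_base) + delta
print(W_eff.shape)  # torch.Size([8, 8])
```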

Training Hyperparameters

  • Training regime: Stage 1 uses FP32 precision. Stage 2 uses a selective mixed-precision strategy. Encoder feed-forward layers, the decoder, and the prompt encoder are quantized to INT8. Attention QKV projections and AdaLoRA parameters (P, Q, A) remain in FP32.
  • Batch Size: 16 during Stage 1.
  • Learning Rates: In Stage 1, 5e-5 for the encoder and 2e-5 for the decoder. In Stage 2, singular values are fine-tuned at 1e-6.
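The per-module learning rates and the selective precision policy above can be expressed as configuration. The module names and the `quantize_to_int8` helper below are hypothetical stand-ins for illustration.

```python
# Hedged sketch of the Stage-1 learning-rate split and Stage-2 precision
# policy (layer/parameter names are illustrative assumptions).
import torch

model = torch.nn.ModuleDict({
    "encoder": torch.nn.Linear(4, 4),
    "decoder": torch.nn.Linear(4, 4),
})
opt_stage1 = torch.optim.AdamW([
    {"params": model["encoder"].parameters(), "lr": 5e-5},
    {"params": model["decoder"].parameters(), "lr": 2e-5},
])

# Stage 2: attention QKV projections and AdaLoRA factors (P, Q, A) stay FP32;
# encoder feed-forward layers, decoder, and prompt encoder go to INT8.
KEEP_FP32 = ("attn.qkv", "adalora.P", "adalora.Q", "adalora.A")

def quantize_to_int8(param_name: str) -> bool:
    return not any(key in param_name for key in KEEP_FP32)

print(quantize_to_int8("encoder.mlp.fc1"))   # True
print(quantize_to_int8("encoder.attn.qkv"))  # False
```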

Speeds, Sizes, Times

  • The model yields a 2.24x model compression compared to base-SAM fine-tuning.
  • Trainable parameters are reduced by 16.6x, down to 5.4M.
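A back-of-envelope check of the figures above: a 16.6x reduction down to 5.4M trainable parameters implies a baseline of roughly 90M trainable parameters before adaptation. This is arithmetic on the card's own numbers, not an additional reported measurement.

```python
# Implied baseline from the reported reduction factors (figures from this card).
trainable_m = 5.4        # trainable params after AdaLoRA, in millions
reduction = 16.6         # reported reduction factor
baseline_m = trainable_m * reduction
print(f"implied baseline trainable params ~ {baseline_m:.1f}M")  # ~ 89.6M
```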

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • The 64,590-image CXR dataset was divided using an 80:10:10 train:validation:test split for all experiments.

Metrics

  • Dice Score (DSC).
  • Intersection over Union (IOU).
  • Normalized Surface Distance (NSD).
  • Structural Similarity Index (SSIM) to evaluate structural agreement and localized improvements.
  • Wilcoxon signed-rank test for statistical significance assessment.
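The two overlap metrics listed above have standard definitions; a minimal NumPy implementation on binary masks is:

```python
# Minimal Dice and IoU on binary masks (standard definitions).
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    inter = np.logical_and(pred, gt).sum()
    return float(2 * inter / (pred.sum() + gt.sum() + eps))

def iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / (union + eps))

p = np.array([[1, 1], [0, 0]], dtype=bool)
g = np.array([[1, 0], [0, 0]], dtype=bool)
print(round(dice(p, g), 3), round(iou(p, g), 3))  # 0.667 0.5
```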

Results

  • AdaLoRA-QAT achieves a 95.6% Dice score.
  • The model matches full-precision SAM decoder fine-tuning accuracy.
  • Statistical analysis confirms that full INT8 quantization preserves segmentation accuracy without significant degradation.
  • SSIM analysis exhibits strong structural agreement along lung boundaries and vascular regions.

Summary

AdaLoRA-QAT effectively balances accuracy, efficiency, and structural trustworthiness. It establishes a proof of concept for substantially compressing foundation models for scalable AI-assisted diagnosis without compromising diagnostic accuracy.

Model Examination

  • Quantization error analysis shows that FP32-INT8 quantization noise follows an approximately zero-mean Gaussian distribution.
  • There is a strong linear correlation between FP32 and INT8 weights.
  • Errors are uniformly distributed across weight magnitudes, confirming preserved numerical fidelity under low-bit quantization.
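The kind of error analysis described above can be reproduced in miniature: quantize a weight tensor to simulated INT8, then check that the noise is near zero-mean and that the quantized weights track the originals linearly. The synthetic Gaussian weights here are a stand-in for real model weights.

```python
# Hedged sketch of FP32-vs-INT8 error analysis on synthetic weights.
import numpy as np

rng = np.random.default_rng(0)
w_fp32 = rng.normal(0, 0.05, size=10000)          # stand-in weight tensor
scale = np.abs(w_fp32).max() / 127.0              # symmetric per-tensor scale
w_int8 = np.clip(np.round(w_fp32 / scale), -128, 127) * scale
noise = w_int8 - w_fp32                           # quantization error
corr = np.corrcoef(w_fp32, w_int8)[0, 1]          # FP32 vs INT8 linearity
print(f"noise mean ~ {noise.mean():.2e}, corr ~ {corr:.4f}")
```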

Model Architecture and Objective

  • The architecture is based on the Segment Anything Model (SAM) incorporating an Image Encoder (ViT), Prompt Encoder, and Mask Decoder.
  • It uses Adaptive Low-Rank Adaptation (AdaLoRA) where the vision encoder rank is reduced from 48 to 32 via importance-based pruning.
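The rank reduction from 48 to 32 mentioned above can be illustrated with importance-based pruning; the random importance scores below are stand-ins for the learned per-rank importances.

```python
# Illustrative importance-based rank pruning from r=48 to r=32
# (importance scores are random stand-ins, not learned values).
import numpy as np

rng = np.random.default_rng(1)
r_init, r_final = 48, 32
importance = rng.random(r_init)            # per-rank importance scores
keep = np.argsort(importance)[-r_final:]   # retain the top-32 ranks
mask = np.zeros(r_init, dtype=bool)
mask[keep] = True
print(int(mask.sum()))  # 32
```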

Hardware

  • NVIDIA RTX A6000 GPUs (48 GB).

Model Card Authors

Prantik Deb, Srimanth Dhondy, N. Ramakrishna, Anu Kapoor, Raju S. Bapi, Tapabrata Chakraborti.
