Clinical Context

Preoperative liver volumetry is a critical step in hepatic surgery planning. Accurate estimation of tumor burden and residual liver volume directly influences surgical feasibility and helps prevent postoperative liver failure.

Traditional segmentation methods are often manual or semi-automatic, leading to:

High time consumption
Strong dependency on clinical expertise
Limited reproducibility

This project proposes a fully automated pipeline that integrates:

Liver and tumor segmentation from CT scans
Quantitative volumetric computation
Multimodal reasoning using MedGemma
Automated structured clinical report generation

Pipeline Overview

The system transforms a raw CT image into a structured clinical interpretation through four main stages:

Model Description

This pipeline integrates convolutional neural networks for medical image segmentation with a quantized large language model for structured medical report generation.

The system combines:

A U-Net model for liver segmentation
A ResU-Net model for tumor segmentation
A quantized MedGemma 1.5-4B model for automated medical reasoning and report generation

After segmentation, binary prediction masks are used to compute:

Total liver volume
Tumor volume
Tumor-to-liver volume ratio

These quantitative results, along with segmentation summaries, are provided as structured input to MedGemma, which generates an automated clinical-style report. The original base model used for quantization is: MedGemma 1.5-4B (Google)
https://huggingface.co/google/medgemma-1.5-4b

Quantization

The MedGemma 1.5-4B model was quantized to 4-bit precision using the bitsandbytes library in order to:

Reduce GPU memory usage
Enable deployment on hardware with limited computational resources
Maintain acceptable performance while optimizing inference speed

Training Details

Dataset: Images from the public 3Dircadb dataset (3D Image Reconstruction for Comparison of Algorithm Database). The original CT volumes were converted into 2D slices and saved in JPEG format for training.
Input size: 256x256
Framework: TensorFlow (segmentation), Transformers (MedGemma)
Hardware: NVIDIA GPU

Multimodal Prompt Construction

The input prompt to MedGemma includes:

The image with segmentation overlay
Structured volumetric values:
- Liver volume
- Tumor volume
- Tumor ratio This multimodal design allows MedGemma to contextualize quantitative metrics using visual evidence, simulating radiological reasoning.

Automated Clinical Report Generation

From the multimodal prompt, MedGemma generates:

Interpretation of tumor burden
Estimation of relative tumor size
Clinical severity insights
Decision-support suggestions (monitoring, surgery consideration)

⚠ This system is not intended to replace medical expertise but to assist rapid and standardized interpretation.

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support