Spaces: eduardofarina / MedGemma1.5ReportGenerator

eduardofarina committed on Jan 28

Commit fa5d00b (verified) · 1 Parent(s): a27703f

Upload folder using huggingface_hub

Files changed (6)

.gitignore +30 -0
README.md +74 -8
app.py +422 -0
dicom_processor.py +255 -0
model_handler.py +173 -0
requirements.txt +10 -0

.gitignore ADDED

	@@ -0,0 +1,30 @@

+# Environment variables (contains secrets)
+.env
+# Downloaded models (large files)
+models/
+.env.local
+.env.*
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+*.egg-info/
+dist/
+build/
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+# OS
+.DS_Store
+Thumbs.db
+# Logs
+*.log

README.md CHANGED

@@ -1,15 +1,81 @@
 ---
-title: MedGemma1.5ReportGenerator
-emoji: 📚
-colorFrom: purple
-colorTo: yellow
 sdk: gradio
-sdk_version: 6.4.0
-python_version: '3.12'
 app_file: app.py
 pinned: false
 license: mit
-short_description: Generate radiology reports with MedGemma 1.5
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: MedGemma 1.5 Report Generator
+emoji: 🏥
+colorFrom: blue
+colorTo: green
 sdk: gradio
+sdk_version: 5.23.3
 app_file: app.py
 pinned: false
 license: mit
 ---
+# MedGemma 1.5 DICOM Report Generator
+A Gradio-based web application that uses Google's MedGemma 1.5 model to automatically generate structured radiology reports from DICOM medical images.
+![Python](https://img.shields.io/badge/python-3.10+-blue.svg)
+![License](https://img.shields.io/badge/license-MIT-green.svg)
+## Features
+- **DICOM Processing**: Upload ZIP files containing DICOM images from CT, MR, CR, or DX studies
+- **Smart Sampling**: Configurable slice sampling per series to manage GPU memory
+- **DICOM Windowing**: Auto or manual window/level controls with CT presets (Brain, Lung, Bone, etc.)
+- **Image Preview**: Built-in gallery to visualize sampled slices before inference
+- **VRAM Estimation**: Real-time estimation of GPU memory usage based on settings
+- **Configurable Generation**: Adjustable temperature, top-p, top-k, and max tokens
+- **Custom Prompts**: Editable prompts for tailored report generation
+## Requirements
+- Python 3.10+
+- NVIDIA GPU with CUDA support (recommended: 12GB+ VRAM)
+- Hugging Face account with access to [google/medgemma-1.5-4b-it](https://huggingface.co/google/medgemma-1.5-4b-it)
+## Usage
+1. Upload a ZIP file containing DICOM images
+2. Adjust settings:
+   - **Max Slices Per Series**: Reduce for less VRAM usage
+   - **Image Size**: Smaller images use less VRAM
+   - **Windowing**: Use presets or manual WC/WW for CT images
+3. Click "Process & Preview" to see the sampled images and VRAM estimate
+4. Click "Generate Report" to create the radiology report
+## Window Presets
+| Preset | Window Center | Window Width | Use Case |
+|--------|--------------|--------------|----------|
+| Brain | 40 | 80 | Brain parenchyma |
+| Subdural | 75 | 215 | Subdural hematoma |
+| Stroke | 32 | 8 | Acute stroke |
+| Lung | -600 | 1500 | Lung parenchyma |
+| Mediastinum | 50 | 350 | Mediastinal structures |
+| Bone | 400 | 1800 | Bone windows |
+| Abdomen | 40 | 400 | Abdominal soft tissue |
+| Liver | 60 | 150 | Liver lesions |
+## Tips for Low VRAM
+- Use **Max Slices Per Series = 5-10** instead of all slices
+- Reduce **Image Size** to 256-384 pixels
+- Process one series at a time for very large studies
+## Disclaimer
+This tool is for research and educational purposes only. It is NOT intended for clinical use or medical diagnosis. Always consult qualified healthcare professionals for medical decisions.
+## License
+MIT License
+## Acknowledgments
+- [Google MedGemma](https://huggingface.co/google/medgemma-1.5-4b-it) for the medical vision-language model
+- [Gradio](https://gradio.app/) for the web interface framework
+- [PyDICOM](https://pydicom.github.io/) for DICOM file processing
+- **Claude Opus** (Anthropic) for assistance in creating this demo
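
The window presets in the README table translate into a clip-and-rescale step on Hounsfield units. The following is a minimal sketch of that mapping, assuming the same linear formula that `apply_windowing` in `dicom_processor.py` uses; the names `WINDOW_PRESETS` and `window_to_uint8` are hypothetical and are not part of this commit.

```python
import numpy as np

# A few presets copied from the README table: name -> (window center, window width)
WINDOW_PRESETS = {
    "Brain": (40, 80),
    "Lung": (-600, 1500),
    "Bone": (400, 1800),
}


def window_to_uint8(hu: np.ndarray, center: float, width: float) -> np.ndarray:
    """Clip HU values to [center - width/2, center + width/2] and rescale to 0..255."""
    low = center - width / 2
    high = center + width / 2
    clipped = np.clip(hu.astype(np.float32), low, high)
    return ((clipped - low) / (high - low) * 255).astype(np.uint8)


# Sample HU values: air, water, soft tissue, upper edge of the brain window, dense bone
hu_values = np.array([-1000, 0, 40, 80, 1000])
center, width = WINDOW_PRESETS["Brain"]
brain = window_to_uint8(hu_values, center, width)
print(brain.tolist())  # [0, 0, 127, 255, 255]
```

With the Brain preset, everything at or below 0 HU becomes black and everything at or above 80 HU becomes white, which is why narrow windows show soft-tissue contrast and hide bone detail.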

app.py ADDED

	@@ -0,0 +1,422 @@

+"""
+Main Gradio application for MedGemma DICOM report drafting.
+"""
+import traceback
+from typing import Optional, Tuple, List
+import gradio as gr
+from PIL import Image
+from dicom_processor import process_dicom_study
+from model_handler import MedGemmaHandler
+model_handler: Optional[MedGemmaHandler] = None
+# Store processed data for reuse
+cached_data = {
+    "zip_bytes": None,
+    "images": None,
+    "modality": None,
+    "study_info": None
+}
+def load_model():
+    """Load the MedGemma model."""
+    global model_handler
+    if model_handler is None:
+        model_handler = MedGemmaHandler()
+        model_handler.load_model()
+    return model_handler
+def process_dicom_file(
+    file_path: str,
+    max_slices_per_series: int,
+    image_size: int,
+    window_center: float,
+    window_width: float,
+    use_auto_window: bool
+) -> Tuple[str, str, List[Image.Image]]:
+    """Process uploaded DICOM ZIP file and return preview images."""
+    global cached_data
+    try:
+        if file_path is None:
+            return "No file uploaded", "", []
+        with open(file_path, 'rb') as f:
+            zip_bytes = f.read()
+        # Use per-series sampling if max_slices_per_series > 0
+        slices_per_series = max_slices_per_series if max_slices_per_series > 0 else None
+        # Use auto window if checkbox is checked
+        wc = None if use_auto_window else window_center
+        ww = None if use_auto_window else window_width
+        modality, images, study_info = process_dicom_study(
+            zip_bytes,
+            max_slices_per_series=slices_per_series,
+            image_size=image_size,
+            window_center=wc,
+            window_width=ww
+        )
+        # Cache for later use in report generation
+        cached_data["zip_bytes"] = zip_bytes
+        cached_data["images"] = images
+        cached_data["modality"] = modality
+        cached_data["study_info"] = study_info
+        max_per_series = study_info.get('MaxSlicesPerSeries', None)
+        sampling_info = f"Max Slices Per Series: {max_per_series}" if max_per_series else "Sampling: Global (all series combined)"
+        # Get window info
+        default_wc = study_info.get('DefaultWindowCenter', 'N/A')
+        default_ww = study_info.get('DefaultWindowWidth', 'N/A')
+        window_info = f"Window: Auto (WC={default_wc}, WW={default_ww})" if use_auto_window else f"Window: Manual (WC={window_center}, WW={window_width})"
+        # Estimate VRAM usage based on actual image size
+        num_images = study_info.get('ProcessedImages', 0)
+        img_size = study_info.get('ImageSize', 896)
+        # Model base: ~8GB, per image scales with size squared
+        model_vram_gb = 8.0
+        # Base estimate for 896x896 is ~50MB, scale proportionally
+        base_per_image_mb = 50
+        size_factor = (img_size / 896) ** 2
+        per_image_vram_mb = base_per_image_mb * size_factor
+        images_vram_gb = (num_images * per_image_vram_mb) / 1024
+        total_vram_gb = model_vram_gb + images_vram_gb
+        info_text = f"""Study Information:
+Modality: {study_info['Modality']}
+Study Description: {study_info['StudyDescription']}
+Study Date: {study_info['StudyDate']}
+Patient ID: {study_info['PatientID']}
+Series Count: {study_info.get('SeriesCount', 'N/A')}
+Total Original Slices: {study_info.get('TotalOriginalSlices', 'N/A')}
+{sampling_info}
+Processed Images: {num_images}
+Image Size: {img_size}x{img_size}
+{window_info}
+--- VRAM Estimate ---
+Model: ~{model_vram_gb:.1f} GB
+Images ({num_images} x {img_size}x{img_size}): ~{images_vram_gb:.1f} GB
+Total Estimated: ~{total_vram_gb:.1f} GB
+"""
+        status = f"✓ Processed {len(images)} images from {study_info['Modality']} study"
+        return status, info_text, images
+    except Exception as e:
+        error_msg = f"Error processing DICOM: {str(e)}"
+        print(error_msg)
+        print(traceback.format_exc())
+        return error_msg, "", []
+def generate_report(
+    file_path: str,
+    max_slices_per_series: int,
+    image_size: int,
+    window_center: float,
+    window_width: float,
+    use_auto_window: bool,
+    prompt: str,
+    max_tokens: int,
+    temperature: float,
+    top_p: float,
+    top_k: int,
+    do_sample: bool,
+    progress=gr.Progress(track_tqdm=True)
+) -> str:
+    """Generate radiology report using MedGemma."""
+    global cached_data
+    try:
+        if file_path is None:
+            return "Please upload a DICOM ZIP file first."
+        progress(0, desc="Loading model...")
+        global model_handler
+        if model_handler is None:
+            model_handler = load_model()
+        # Read the upload so the cache can be checked against the current file
+        progress(0.2, desc="Reading DICOM files...")
+        with open(file_path, 'rb') as f:
+            zip_bytes = f.read()
+        # Reuse cached images only if they came from this same file
+        # (the cached images keep the settings used at preview time)
+        use_cache = (
+            cached_data["images"] is not None and
+            cached_data["zip_bytes"] == zip_bytes
+        )
+        if use_cache:
+            progress(0.4, desc="Using cached images...")
+            images = cached_data["images"]
+            modality = cached_data["modality"]
+        else:
+            progress(0.4, desc="Processing images...")
+            slices_per_series = max_slices_per_series if max_slices_per_series > 0 else None
+            wc = None if use_auto_window else window_center
+            ww = None if use_auto_window else window_width
+            modality, images, study_info = process_dicom_study(
+                zip_bytes,
+                max_slices_per_series=slices_per_series,
+                image_size=image_size,
+                window_center=wc,
+                window_width=ww
+            )
+        progress(0.6, desc=f"Generating report with MedGemma 1.5 ({len(images)} images)...")
+        # Use custom prompt or default
+        if not prompt.strip():
+            prompt = f"You are a radiologist, please draft the full structured report for the following {modality} exam. Include the following sections: Technique, Findings, and Impression."
+        report = model_handler.generate_report(
+            images=images,
+            prompt=prompt,
+            max_new_tokens=max_tokens,
+            temperature=temperature,
+            top_p=top_p,
+            top_k=top_k,
+            do_sample=do_sample,
+        )
+        progress(1.0, desc="Complete!")
+        return report
+    except Exception as e:
+        error_msg = f"Error generating report: {str(e)}\n\n{traceback.format_exc()}"
+        print(error_msg)
+        return error_msg
+def create_interface():
+    """Create the Gradio interface."""
+    with gr.Blocks(title="MedGemma 1.5 DICOM Report Generator", theme=gr.themes.Soft()) as demo:
+        gr.Markdown("# 🏥 MedGemma 1.5 DICOM Report Generator")
+        gr.Markdown("Upload a ZIP file containing DICOM images to generate a structured radiology report.")
+        with gr.Row():
+            # Left column: Upload and settings
+            with gr.Column(scale=1):
+                file_input = gr.File(
+                    label="Upload DICOM ZIP",
+                    file_types=[".zip"],
+                    type="filepath"
+                )
+                with gr.Accordion("Image Processing Settings", open=True):
+                    max_slices_slider = gr.Slider(
+                        minimum=0,
+                        maximum=50,
+                        value=10,
+                        step=1,
+                        label="Max Slices Per Series",
+                        info="0 = use all slices. Reduce to save VRAM."
+                    )
+                    image_size_slider = gr.Slider(
+                        minimum=224,
+                        maximum=1024,
+                        value=512,
+                        step=32,
+                        label="Image Size",
+                        info="Smaller = less VRAM, lower quality"
+                    )
+                    gr.Markdown("**Windowing (for CT/X-ray)**")
+                    use_auto_window = gr.Checkbox(
+                        label="Use Auto Window (from DICOM metadata)",
+                        value=True
+                    )
+                    with gr.Row():
+                        window_center_slider = gr.Slider(
+                            minimum=-1000,
+                            maximum=3000,
+                            value=40,
+                            step=10,
+                            label="Window Center (WC)",
+                            info="e.g., Brain=40, Lung=-600, Bone=400"
+                        )
+                        window_width_slider = gr.Slider(
+                            minimum=1,
+                            maximum=4000,
+                            value=400,
+                            step=10,
+                            label="Window Width (WW)",
+                            info="e.g., Brain=80, Lung=1500, Bone=1800"
+                        )
+                process_btn = gr.Button("Process & Preview", variant="primary", size="lg")
+                status_output = gr.Textbox(
+                    label="Status",
+                    interactive=False
+                )
+                study_info_box = gr.Textbox(
+                    label="Study Information & VRAM Estimate",
+                    interactive=False,
+                    lines=14
+                )
+            # Middle column: Image preview
+            with gr.Column(scale=1):
+                gr.Markdown("### 🖼️ Image Preview")
+                gr.Markdown("*Preview of sampled slices that will be sent to the model*")
+                image_gallery = gr.Gallery(
+                    label="Sampled Slices",
+                    show_label=False,
+                    columns=4,
+                    rows=3,
+                    height=400,
+                    object_fit="contain",
+                    preview=True
+                )
+            # Right column: Generation settings and output
+            with gr.Column(scale=1):
+                prompt_input = gr.Textbox(
+                    label="Prompt",
+                    lines=3,
+                    value="You are a radiologist, please draft the full structured report for this exam. Include: Technique, Findings, and Impression.",
+                    info="Customize the prompt. Leave empty for default."
+                )
+                with gr.Accordion("Model Settings", open=False):
+                    with gr.Row():
+                        max_tokens_slider = gr.Slider(
+                            minimum=50,
+                            maximum=1000,
+                            value=350,
+                            step=10,
+                            label="Max Tokens"
+                        )
+                        temperature_slider = gr.Slider(
+                            minimum=0.0,
+                            maximum=2.0,
+                            value=0.7,
+                            step=0.1,
+                            label="Temperature"
+                        )
+                    with gr.Row():
+                        top_p_slider = gr.Slider(
+                            minimum=0.0,
+                            maximum=1.0,
+                            value=0.9,
+                            step=0.05,
+                            label="Top P"
+                        )
+                        top_k_slider = gr.Slider(
+                            minimum=1,
+                            maximum=100,
+                            value=50,
+                            step=1,
+                            label="Top K"
+                        )
+                    do_sample_checkbox = gr.Checkbox(
+                        label="Enable Sampling",
+                        value=True,
+                        info="Uncheck for deterministic output"
+                    )
+                generate_btn = gr.Button("🚀 Generate Report", variant="primary", size="lg")
+                report_output = gr.Textbox(
+                    label="Generated Report",
+                    interactive=False,
+                    lines=18,
+                    placeholder="Report will appear here..."
+                )
+        # Common window presets
+        with gr.Accordion("Window Presets (click to apply)", open=False):
+            gr.Markdown("**CT Presets:**")
+            with gr.Row():
+                brain_btn = gr.Button("Brain (40/80)", size="sm")
+                subdural_btn = gr.Button("Subdural (75/215)", size="sm")
+                stroke_btn = gr.Button("Stroke (32/8)", size="sm")
+                lung_btn = gr.Button("Lung (-600/1500)", size="sm")
+                mediastinum_btn = gr.Button("Mediastinum (50/350)", size="sm")
+                bone_btn = gr.Button("Bone (400/1800)", size="sm")
+                abdomen_btn = gr.Button("Abdomen (40/400)", size="sm")
+                liver_btn = gr.Button("Liver (60/150)", size="sm")
+        # Event handlers for presets
+        brain_btn.click(lambda: (40, 80, False), outputs=[window_center_slider, window_width_slider, use_auto_window])
+        subdural_btn.click(lambda: (75, 215, False), outputs=[window_center_slider, window_width_slider, use_auto_window])
+        stroke_btn.click(lambda: (32, 8, False), outputs=[window_center_slider, window_width_slider, use_auto_window])
+        lung_btn.click(lambda: (-600, 1500, False), outputs=[window_center_slider, window_width_slider, use_auto_window])
+        mediastinum_btn.click(lambda: (50, 350, False), outputs=[window_center_slider, window_width_slider, use_auto_window])
+        bone_btn.click(lambda: (400, 1800, False), outputs=[window_center_slider, window_width_slider, use_auto_window])
+        abdomen_btn.click(lambda: (40, 400, False), outputs=[window_center_slider, window_width_slider, use_auto_window])
+        liver_btn.click(lambda: (60, 150, False), outputs=[window_center_slider, window_width_slider, use_auto_window])
+        # Main event handlers
+        process_btn.click(
+            fn=process_dicom_file,
+            inputs=[
+                file_input,
+                max_slices_slider,
+                image_size_slider,
+                window_center_slider,
+                window_width_slider,
+                use_auto_window
+            ],
+            outputs=[status_output, study_info_box, image_gallery]
+        )
+        generate_btn.click(
+            fn=generate_report,
+            inputs=[
+                file_input,
+                max_slices_slider,
+                image_size_slider,
+                window_center_slider,
+                window_width_slider,
+                use_auto_window,
+                prompt_input,
+                max_tokens_slider,
+                temperature_slider,
+                top_p_slider,
+                top_k_slider,
+                do_sample_checkbox
+            ],
+            outputs=[report_output]
+        )
+        gr.Markdown("---")
+        gr.Markdown("**Supported Modalities:** CT, MR, CR, DX | **Tip:** Use fewer slices and smaller image size to reduce VRAM usage")
+    return demo
+def main():
+    """Main entry point."""
+    print("Starting MedGemma 1.5 DICOM Report Generator...")
+    print("Note: The model will be loaded on first report generation.")
+    demo = create_interface()
+    demo.launch(
+        server_name="0.0.0.0",
+        server_port=7860,
+        share=False,
+        show_error=True
+    )
+if __name__ == "__main__":
+    main()
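
The VRAM estimate shown in the study information box can be isolated as a small function. This is a sketch that mirrors the arithmetic in `process_dicom_file` (an 8 GB model baseline plus roughly 50 MB per 896x896 image, scaled by the square of the image size); the function name `estimate_vram_gb` and its parameters are hypothetical, and the constants are the rough figures from the code above, not measured values.

```python
def estimate_vram_gb(
    num_images: int,
    image_size: int,
    model_gb: float = 8.0,            # rough baseline used in app.py
    base_per_image_mb: float = 50.0,  # rough cost of one 896x896 image in app.py
    reference_size: int = 896,
) -> float:
    """Return the estimated total VRAM in GB for a batch of square images."""
    # Per-image cost scales with pixel count, i.e. with the square of the side length
    size_factor = (image_size / reference_size) ** 2
    images_gb = num_images * base_per_image_mb * size_factor / 1024
    return model_gb + images_gb


print(f"{estimate_vram_gb(10, 896):.2f} GB")  # 8.49 GB
print(f"{estimate_vram_gb(10, 448):.2f} GB")  # 8.12 GB
```

Halving the image side cuts the per-image term to a quarter, so under this estimate the image size setting has a larger effect than the slice count for the same relative change.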

dicom_processor.py ADDED

	@@ -0,0 +1,255 @@

+"""
+DICOM utilities for processing medical imaging studies.
+"""
+import io
+import zipfile
+from typing import List, Tuple, Dict, Optional
+import numpy as np
+from PIL import Image
+import pydicom
+def has_pixel_data(ds: pydicom.Dataset) -> bool:
+    """Check if DICOM dataset has pixel data."""
+    return (
+        'PixelData' in ds or
+        'FloatPixelData' in ds or
+        'DoubleFloatPixelData' in ds
+    )
+def extract_dicom_from_zip(zip_bytes: bytes) -> List[Tuple[str, pydicom.Dataset]]:
+    """Extract DICOM files from a ZIP archive, filtering out non-image files."""
+    dicom_files = []
+    with zipfile.ZipFile(io.BytesIO(zip_bytes), 'r') as zip_ref:
+        for filename in zip_ref.namelist():
+            if filename.lower().endswith('.dcm'):
+                try:
+                    file_bytes = zip_ref.read(filename)
+                    ds = pydicom.dcmread(io.BytesIO(file_bytes))
+                    # Skip files without pixel data (SR, reports, dose records, etc.)
+                    if has_pixel_data(ds):
+                        dicom_files.append((filename, ds))
+                    else:
+                        print(f"Skipping {filename}: No pixel data (likely SR or report)")
+                except Exception as e:
+                    print(f"Error reading {filename}: {e}")
+    return dicom_files
+def get_modality(ds: pydicom.Dataset) -> str:
+    return getattr(ds, 'Modality', 'Unknown')
+def get_study_info(ds: pydicom.Dataset, total_slices: int) -> Dict:
+    return {
+        'StudyInstanceUID': getattr(ds, 'StudyInstanceUID', 'Unknown'),
+        'StudyDescription': getattr(ds, 'StudyDescription', 'Unknown'),
+        'Modality': get_modality(ds),
+        'TotalSlices': total_slices,
+        'StudyDate': getattr(ds, 'StudyDate', 'Unknown'),
+        'PatientID': getattr(ds, 'PatientID', 'Unknown'),
+    }
+def get_default_window(ds: pydicom.Dataset) -> Tuple[Optional[float], Optional[float]]:
+    """Get default window center and width from DICOM metadata."""
+    wc = getattr(ds, 'WindowCenter', None)
+    ww = getattr(ds, 'WindowWidth', None)
+    # Handle multi-valued windows (take first)
+    if wc is not None:
+        wc = float(wc[0]) if hasattr(wc, '__iter__') and not isinstance(wc, str) else float(wc)
+    if ww is not None:
+        ww = float(ww[0]) if hasattr(ww, '__iter__') and not isinstance(ww, str) else float(ww)
+    return wc, ww
+def apply_windowing(
+    pixel_array: np.ndarray,
+    ds: pydicom.Dataset,
+    window_center: Optional[float] = None,
+    window_width: Optional[float] = None
+) -> np.ndarray:
+    """Apply rescale slope/intercept and windowing to pixel array."""
+    # Apply rescale slope and intercept (converts to HU for CT)
+    slope = getattr(ds, 'RescaleSlope', 1)
+    intercept = getattr(ds, 'RescaleIntercept', 0)
+    pixel_array = pixel_array.astype(np.float32) * slope + intercept
+    # Get window values
+    if window_center is None or window_width is None:
+        default_wc, default_ww = get_default_window(ds)
+        if window_center is None:
+            window_center = default_wc
+        if window_width is None:
+            window_width = default_ww
+    # Apply windowing if we have valid values
+    if window_center is not None and window_width is not None and window_width > 0:
+        min_val = window_center - window_width / 2
+        max_val = window_center + window_width / 2
+        pixel_array = np.clip(pixel_array, min_val, max_val)
+        normalized = ((pixel_array - min_val) / (max_val - min_val) * 255).astype(np.uint8)
+    else:
+        # Fallback: normalize to full range
+        pixel_min = pixel_array.min()
+        pixel_max = pixel_array.max()
+        if pixel_max > pixel_min:
+            normalized = ((pixel_array - pixel_min) / (pixel_max - pixel_min) * 255).astype(np.uint8)
+        else:
+            normalized = np.zeros_like(pixel_array, dtype=np.uint8)
+    return normalized
+def dicom_to_pil(
+    ds: pydicom.Dataset,
+    size: Tuple[int, int] = (896, 896),
+    window_center: Optional[float] = None,
+    window_width: Optional[float] = None
+) -> Image.Image:
+    """Convert DICOM dataset to PIL Image with optional windowing and resizing."""
+    pixel_array = ds.pixel_array
+    normalized = apply_windowing(pixel_array, ds, window_center, window_width)
+    if len(normalized.shape) == 2:
+        pil_image = Image.fromarray(normalized, mode='L')
+    elif len(normalized.shape) == 3 and normalized.shape[2] <= 4:
+        if normalized.shape[2] == 1:
+            pil_image = Image.fromarray(normalized[:, :, 0], mode='L')
+        elif normalized.shape[2] == 3:
+            pil_image = Image.fromarray(normalized, mode='RGB')
+        elif normalized.shape[2] == 4:
+            pil_image = Image.fromarray(normalized[:, :, :3], mode='RGB')
+        else:
+            pil_image = Image.fromarray(normalized[:, :, 0], mode='L')
+    else:
+        pil_image = Image.fromarray(normalized[0], mode='L')
+    if pil_image.mode != 'RGB':
+        pil_image = pil_image.convert('RGB')
+    pil_image = pil_image.resize(size, Image.LANCZOS)
+    return pil_image
+def organize_by_series(dicom_files: List[Tuple[str, pydicom.Dataset]]) -> Dict[str, List[Tuple[str, pydicom.Dataset]]]:
+    series_dict = {}
+    for filename, ds in dicom_files:
+        series_uid = getattr(ds, 'SeriesInstanceUID', 'Unknown')
+        if series_uid not in series_dict:
+            series_dict[series_uid] = []
+        series_dict[series_uid].append((filename, ds))
+    return series_dict
+def sort_slices_by_position(series_files: List[Tuple[str, pydicom.Dataset]]) -> List[Tuple[str, pydicom.Dataset]]:
+    def get_sort_key(item):
+        filename, ds = item
+        instance_num = getattr(ds, 'InstanceNumber', None)
+        if instance_num is not None:
+            return (0, int(instance_num))
+        slice_loc = getattr(ds, 'SliceLocation', None)
+        if slice_loc is not None:
+            return (1, float(slice_loc))
+        return (2, filename)
+    return sorted(series_files, key=get_sort_key)
+def sample_slices_evenly(all_slices: List[Tuple[str, pydicom.Dataset]], max_slices: int = 500) -> List[Tuple[str, pydicom.Dataset]]:
+    if len(all_slices) <= max_slices:
+        return all_slices
+    if max_slices == 1:
+        # Avoid division by zero in the index formula; take the middle slice
+        return [all_slices[len(all_slices) // 2]]
+    indices = [int(i * (len(all_slices) - 1) / (max_slices - 1)) for i in range(max_slices)]
+    return [all_slices[i] for i in indices]
+def process_dicom_study(
+    zip_bytes: bytes,
+    max_slices: int = 500,
+    max_slices_per_series: Optional[int] = None,
+    image_size: int = 896,
+    window_center: Optional[float] = None,
+    window_width: Optional[float] = None
+) -> Tuple[str, List[Image.Image], Dict]:
+    """
+    Process a DICOM study from a ZIP file.
+    Args:
+        zip_bytes: ZIP file contents
+        max_slices: Maximum total slices across all series (used if max_slices_per_series is None)
+        max_slices_per_series: If set, sample this many slices evenly from each series
+        image_size: Output image size (square, e.g., 896 for 896x896)
+        window_center: Window center for display (None = use DICOM default or auto)
+        window_width: Window width for display (None = use DICOM default or auto)
+    """
+    dicom_files = extract_dicom_from_zip(zip_bytes)
+    if not dicom_files:
+        raise ValueError("No valid DICOM files found in the ZIP archive")
+    first_ds = dicom_files[0][1]
+    modality = get_modality(first_ds)
+    # Get default window from first image
+    default_wc, default_ww = get_default_window(first_ds)
+    series_dict = organize_by_series(dicom_files)
+    # Count total original slices
+    total_original_slices = sum(len(files) for files in series_dict.values())
+    # Sample slices per series or globally
+    sampled_slices = []
+    if max_slices_per_series is not None:
+        # Sample evenly from each series
+        for series_uid, series_files in series_dict.items():
+            sorted_slices = sort_slices_by_position(series_files)
+            series_sampled = sample_slices_evenly(sorted_slices, max_slices_per_series)
+            sampled_slices.extend(series_sampled)
+    else:
+        # Original behavior: sample globally
+        all_sorted_slices = []
+        for series_uid, series_files in series_dict.items():
+            sorted_slices = sort_slices_by_position(series_files)
+            all_sorted_slices.extend(sorted_slices)
+        sampled_slices = sample_slices_evenly(all_sorted_slices, max_slices)
+    sampled_count = len(sampled_slices)
+    study_info = get_study_info(first_ds, sampled_count)
+    study_info['SeriesCount'] = len(series_dict)
+    study_info['TotalOriginalSlices'] = total_original_slices
+    study_info['SampledSlices'] = sampled_count
+    study_info['ImageSize'] = image_size
+    study_info['DefaultWindowCenter'] = default_wc
+    study_info['DefaultWindowWidth'] = default_ww
+    if max_slices_per_series is not None:
+        study_info['MaxSlicesPerSeries'] = max_slices_per_series
+    images = []
+    for filename, ds in sampled_slices:
+        try:
+            pil_image = dicom_to_pil(
+                ds,
+                size=(image_size, image_size),
+                window_center=window_center,
+                window_width=window_width
+            )
+            images.append(pil_image)
+        except Exception as e:
+            print(f"Error converting {filename}: {e}")
+    study_info['ProcessedImages'] = len(images)
+    return modality, images, study_info
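
The even sampling performed by `sample_slices_evenly` can be illustrated on bare indices, without any DICOM data. This is a minimal sketch under the assumption that the first and last slice should always be included; `even_indices` is a hypothetical helper, and returning the middle index when a single slice is requested is a choice made here for illustration.

```python
from typing import List


def even_indices(total: int, wanted: int) -> List[int]:
    """Return up to `wanted` indices spread evenly over range(total)."""
    if wanted <= 0:
        return []
    if total <= wanted:
        # Fewer slices than requested: keep them all
        return list(range(total))
    if wanted == 1:
        # The general formula divides by (wanted - 1), so handle this case separately
        return [total // 2]
    # Map i = 0..wanted-1 linearly onto 0..total-1 and truncate to an integer index
    return [int(i * (total - 1) / (wanted - 1)) for i in range(wanted)]


print(even_indices(10, 4))   # [0, 3, 6, 9]
print(even_indices(100, 5))  # [0, 24, 49, 74, 99]
```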

model_handler.py ADDED

	@@ -0,0 +1,173 @@

+"""
+Model handler for MedGemma 1.5 inference.
+"""
+import os
+import torch
+from PIL import Image
+from typing import List, Optional
+from dotenv import load_dotenv
+from transformers import AutoProcessor, AutoModelForImageTextToText
+# Load environment variables from .env file
+load_dotenv()
+def check_gpu_availability():
+    """Check GPU availability and print diagnostics."""
+    print("=" * 60)
+    print("GPU Availability Check")
+    print("=" * 60)
+    cuda_available = torch.cuda.is_available()
+    print(f"CUDA available: {cuda_available}")
+    if cuda_available:
+        device_count = torch.cuda.device_count()
+        print(f"Number of GPUs: {device_count}")
+        for i in range(device_count):
+            device_name = torch.cuda.get_device_name(i)
+            print(f"  GPU {i}: {device_name}")
+        print(f"Current GPU: {torch.cuda.current_device()}")
+    else:
+        print("CUDA is not available. Model will use CPU (slow).")
+        print("\nTo use GPU, ensure you have:")
+        print("1. NVIDIA GPU with CUDA support")
+        print("2. CUDA toolkit installed")
+        print("3. PyTorch with CUDA support: pip install torch --index-url https://download.pytorch.org/whl/cu118")
+    print("=" * 60)
+    return cuda_available
+class MedGemmaHandler:
+    """Handler for MedGemma 1.5 model inference."""
+    def __init__(self, model_id: str = "google/medgemma-1.5-4b-it", device: Optional[str] = None):
+        self.model_id = model_id
+        self.device = device
+        self.processor = None
+        self.model = None
+        # Check for local model path (useful for local development)
+        local_model_path = os.path.join(os.path.dirname(__file__), "models", "medgemma-1.5-4b-it")
+        if os.path.exists(local_model_path) and os.path.isfile(os.path.join(local_model_path, "config.json")):
+            self.model_id = local_model_path
+            print(f"Using local model from: {local_model_path}")
+        else:
+            print(f"Using model from Hugging Face Hub: {self.model_id}")
+    def load_model(self):
+        """Load the MedGemma 1.5 model and processor."""
+        print(f"Loading MedGemma model: {self.model_id}")
+        # Check GPU availability
+        cuda_available = check_gpu_availability()
+        # Determine device
+        if self.device is None:
+            if cuda_available:
+                self.device = "cuda"
+                print(f"Using GPU: {torch.cuda.get_device_name(0)}")
+            else:
+                self.device = "cpu"
+                print("WARNING: Using CPU - this will be very slow!")
+        else:
+            print(f"Using device: {self.device}")
+        # Get HF token from environment
+        hf_token = os.getenv("HF_TOKEN")
+        if hf_token:
+            print("Using Hugging Face token from .env file")
+        else:
+            print("Warning: No HF_TOKEN found in .env file")
+        self.processor = AutoProcessor.from_pretrained(self.model_id, token=hf_token)
+        # Load model with proper device configuration
+        if self.device == "cuda" and torch.cuda.is_available():
+            print("Loading model on GPU with bfloat16...")
+            self.model = AutoModelForImageTextToText.from_pretrained(
+                self.model_id,
+                torch_dtype=torch.bfloat16,
+                device_map="cuda",
+                token=hf_token,
+            )
+        else:
+            print("Loading model on CPU (this may take a while)...")
+            self.model = AutoModelForImageTextToText.from_pretrained(
+                self.model_id,
+                torch_dtype=torch.float32,  # Use float32 for CPU
+                device_map="cpu",
+                token=hf_token,
+            )
+        print(f"Model loaded on device: {next(self.model.parameters()).device}")
+        print("Model loaded successfully!")
+    def generate_report(
+        self,
+        images: List[Image.Image],
+        prompt: str,
+        max_new_tokens: int = 350,
+        temperature: float = 0.7,
+        top_p: float = 0.9,
+        top_k: int = 50,
+        do_sample: bool = True,
+    ) -> str:
+        """Generate a radiology report from medical images."""
+        if self.model is None or self.processor is None:
+            raise RuntimeError("Model not loaded. Call load_model() first.")
+        content = [{"type": "image", "image": img} for img in images]
+        content.append({"type": "text", "text": prompt})
+        messages = [
+            {
+                "role": "user",
+                "content": content
+            }
+        ]
+        inputs = self.processor.apply_chat_template(
+            messages,
+            add_generation_prompt=True,
+            tokenize=True,
+            return_dict=True,
+            return_tensors="pt"
+        )
+        # Move to device with proper dtype
+        if self.device == "cuda":
+            inputs = inputs.to(self.model.device, dtype=torch.bfloat16)
+        else:
+            inputs = inputs.to(self.model.device)
+        input_len = inputs["input_ids"].shape[-1]
+        with torch.inference_mode():
+            if do_sample and temperature > 0:
+                generation = self.model.generate(
+                    **inputs,
+                    max_new_tokens=max_new_tokens,
+                    do_sample=True,
+                    temperature=temperature,
+                    top_p=top_p,
+                    top_k=top_k,
+                )
+            else:
+                generation = self.model.generate(
+                    **inputs,
+                    max_new_tokens=max_new_tokens,
+                    do_sample=False,
+                )
+            generation = generation[0][input_len:]
+        report = self.processor.decode(generation, skip_special_tokens=True)
+        # Clear GPU cache after inference
+        if self.device == "cuda":
+            torch.cuda.empty_cache()
+            print("GPU cache cleared.")
+        return report
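
The multi-image chat message that `generate_report` passes to `apply_chat_template` is a plain nested structure, so it can be built and inspected without loading the model. The sketch below reproduces that structure; `build_messages` is a hypothetical helper, and the strings stand in for the `PIL.Image` objects the handler actually receives.

```python
from typing import Any, Dict, List


def build_messages(images: List[Any], prompt: str) -> List[Dict[str, Any]]:
    """Build a single user turn: one entry per image, followed by the text prompt."""
    content: List[Dict[str, Any]] = [{"type": "image", "image": img} for img in images]
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]


# Placeholder strings are used here instead of real PIL images
messages = build_messages(["slice_0", "slice_1"], "Draft a structured report.")
for part in messages[0]["content"]:
    print(part["type"])
```

All images go into one user turn ahead of the prompt, so the number of sampled slices directly determines how many image entries the processor has to encode.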

requirements.txt ADDED

	@@ -0,0 +1,10 @@

+gradio>=4.0.0
+transformers>=4.50.0
+torch>=2.2.0
+torchvision
+accelerate
+pydicom>=2.4.0
+Pillow>=10.0.0
+numpy>=1.24.0,<2.0
+python-dotenv>=1.0.0
+scipy