Mosaic Architecture
This document describes the internal architecture and module organization of the Mosaic application.
Overview
Mosaic is a deep learning pipeline for analyzing H&E whole slide images (WSIs) to predict:
- Cancer Subtypes using the Aeon model
- Biomarkers using the Paladin model
The application is organized into several focused modules with clear separation of concerns.
Module Structure
The Mosaic application has been refactored for better readability and maintainability. The codebase is now organized into the following modules:
Core Modules
mosaic.gradio_app (Main Entry Point)
- Location: src/mosaic/gradio_app.py
- Purpose: CLI entry point and command-line argument parsing
- Responsibilities:
  - Command-line argument parsing
  - Model downloading and initialization
  - Single slide and batch processing CLI modes
  - Launching the Gradio web UI
mosaic.analysis
- Location: src/mosaic/analysis.py
- Purpose: Core slide analysis logic
- Responsibilities:
  - Tissue segmentation
  - Feature extraction (CTransPath and Optimus)
  - Feature filtering with marker classifier
  - Aeon inference (cancer subtype prediction)
  - Paladin inference (biomarker prediction)
- Key Function: analyze_slide()
mosaic.ui Package
- Location: src/mosaic/ui/
- Purpose: Gradio web interface components
- Submodules:
  - ui.__init__.py: Exports the main launch_gradio function
  - ui.app: Gradio interface definition
    - UI layout and component definitions
    - Event handlers for user interactions
    - Multi-slide analysis workflow
    - Key Functions: launch_gradio(), analyze_slides(), set_cancer_subtype_maps()
  - ui.utils: UI utility functions
    - Settings validation
    - CSV file handling
    - OncoTree API integration
    - User session directory management
    - Key Functions: validate_settings(), load_settings(), get_oncotree_code_name(), create_user_directory()
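As a rough illustration of the settings validation and CSV handling listed for ui.utils, the sketch below parses CSV text and reports per-row problems. It is not the actual validate_settings() or load_settings() implementation: the function names, the column names (slide, site_type), and the allowed values are all hypothetical, chosen only to show the shape of such a check.

```python
import csv
import io

# Hypothetical column names and allowed values, chosen only for illustration.
REQUIRED_COLUMNS = ("slide", "site_type")
ALLOWED_SITE_TYPES = {"Primary", "Metastatic"}


def load_settings_rows(csv_text):
    """Parse settings CSV text into a list of dicts, one per slide."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def find_settings_problems(rows):
    """Return human-readable problems; an empty list means every row passed."""
    problems = []
    for index, row in enumerate(rows, start=1):
        # Every required column must be present and non-blank.
        for column in REQUIRED_COLUMNS:
            if not (row.get(column) or "").strip():
                problems.append(f"row {index}: missing '{column}'")
        # A non-blank site_type must be one of the allowed values.
        site_type = (row.get("site_type") or "").strip()
        if site_type and site_type not in ALLOWED_SITE_TYPES:
            problems.append(f"row {index}: unknown site_type '{site_type}'")
    return problems


example_csv = "slide,site_type\na.svs,Primary\nb.svs,Lymph\n,Metastatic\n"
problems = find_settings_problems(load_settings_rows(example_csv))
for problem in problems:
    print(problem)
```

Collecting all problems instead of raising on the first one lets a UI show the user every issue in a single pass.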
Inference Modules
mosaic.inference
- Location: src/mosaic/inference/
- Purpose: ML model inference implementations
- Submodules:
  - aeon.py: Cancer subtype inference
  - paladin.py: Biomarker inference
  - data.py: Data structures and utilities
Code Organization Benefits
- Separation of Concerns: UI, analysis, and CLI logic are now clearly separated
- Improved Maintainability: Each module has a single, well-defined responsibility
- Better Testability: Individual modules can be tested independently
- Enhanced Readability: Reduced file sizes and clear module boundaries
- Reusability: Analysis functions can be imported and used without UI dependencies
Import Flow
gradio_app.main()
├── download_and_process_models()
│   ├── set_cancer_subtype_maps() [from ui.app]
│   └── get_oncotree_code_name() [from ui.utils]
├── analyze_slide() [from analysis]
│   ├── segment_tissue() [from mussel]
│   ├── get_features() [from mussel]
│   ├── filter_features() [from mussel]
│   ├── run_aeon() [from inference]
│   └── run_paladin() [from inference]
└── launch_gradio() [from ui]
    ├── analyze_slides() [from ui.app]
    │   └── analyze_slide() [from analysis]
    └── validate_settings() [from ui.utils]
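The top level of this call tree can be sketched as a small dispatcher: the entry point parses arguments, then either analyzes slides directly or launches the web UI. This is a minimal sketch, not the real gradio_app.main(). The flag names --slide-path and --slide-csv are assumptions, and analyze_slide() and launch_gradio() are stand-ins that only return strings.

```python
import argparse


def analyze_slide(path):
    # Stand-in for the function of the same name in mosaic.analysis.
    return f"analyzed {path}"


def launch_gradio():
    # Stand-in for the function exported by mosaic.ui.
    return "ui launched"


def build_parser():
    parser = argparse.ArgumentParser(prog="mosaic-sketch")
    # Hypothetical flags; the real CLI options may differ.
    parser.add_argument("--slide-path", help="analyze one slide and exit")
    parser.add_argument("--slide-csv", help="analyze every slide listed in a CSV")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.slide_path:
        # Single-slide CLI mode: no UI code is touched.
        return analyze_slide(args.slide_path)
    if args.slide_csv:
        # Batch CLI mode (the per-row loop is omitted in this sketch).
        return f"batch from {args.slide_csv}"
    # No slide arguments: fall through to the web UI.
    return launch_gradio()


print(main(["--slide-path", "example.svs"]))
print(main([]))
```

Because the CLI branches return before the UI branch is reached, the analysis path stays usable in environments where the web UI is never started.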
File Size Comparison
| File | Original | Refactored | Change |
|---|---|---|---|
| gradio_app.py | 843 lines | 230 lines | -73% |
| UI Components | - | 474 lines | +474 |
| Analysis Logic | - | 200 lines | +200 |
The refactoring split the original monolithic file into focused modules while preserving all functionality.
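The figures in the table can be checked with a few lines of arithmetic. The -73% applies to gradio_app.py alone. Assuming the three rows cover all of the refactored code, the total line count rises slightly, which is the usual cost of adding module boundaries.

```python
original = 843  # lines in the original gradio_app.py
refactored = {"gradio_app.py": 230, "UI Components": 474, "Analysis Logic": 200}

# Relative change of the entry-point file alone.
entry_point_change = (refactored["gradio_app.py"] - original) / original * 100

# Total size after the split, assuming the table lists every refactored part.
total_after = sum(refactored.values())
total_change = (total_after - original) / original * 100

print(f"entry point: {entry_point_change:.0f}%")
print(f"total after refactor: {total_after} lines ({total_change:+.1f}%)")
```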
Key Dependencies
External Libraries
- Gradio: Web interface framework for creating the UI
- PyTorch: Deep learning framework for model inference
- Pandas: Data manipulation and CSV handling
- Mussel: Pathology-specific utilities for:
  - Tissue segmentation
  - Feature extraction (CTransPath, Optimus)
  - Marker classification
- Paladin: Biomarker prediction models
- HuggingFace Hub: Model downloading and management
- Loguru: Logging with enhanced features
Model Components
- CTransPath: Pre-trained vision transformer for histopathology feature extraction
- Optimus: Foundation model for pathology image features
- Marker Classifier: Filters features to tumor-relevant regions
- Aeon: Multi-task model for cancer subtype classification
- Paladin: Suite of models for biomarker prediction across cancer subtypes
Data Flow
WSI File (*.svs, *.tif)
    ↓
Tissue Segmentation (Mussel)
    ↓
CTransPath Feature Extraction
    ↓
Marker Classification (filter to tumor regions)
    ↓
Optimus Feature Extraction (on filtered tiles)
    ↓
├── Aeon Inference → Cancer Subtype Predictions
│       ↓
└── Paladin Inference → Biomarker Predictions
    ↓
Results (CSV, Visualizations)
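The ordering of the stages above can be sketched in plain Python. Every function here is a hypothetical stand-in that returns placeholder data. None of them is the Mussel, Aeon, or Paladin API. The sketch also assumes, as the arrow from Aeon to Paladin suggests, that biomarker prediction consumes the predicted subtype.

```python
def segment_tissue(slide_path):
    # Stand-in: pretend the slide yields four tissue tiles.
    return [f"{slide_path}#tile{i}" for i in range(4)]


def extract_features(tiles, model_name):
    # Stand-in: one placeholder feature record per tile.
    return [{"tile": tile, "model": model_name} for tile in tiles]


def keep_tumor_tiles(ctranspath_features):
    # Stand-in marker classifier: keeps every other tile.
    return [f["tile"] for i, f in enumerate(ctranspath_features) if i % 2 == 0]


def predict_subtype(optimus_features):
    # Stand-in for Aeon; returns a placeholder label.
    return "SUBTYPE_A"


def predict_biomarkers(optimus_features, subtype):
    # Stand-in for Paladin; assumed to depend on the predicted subtype.
    return {"subtype": subtype, "n_tiles": len(optimus_features)}


def run_pipeline(slide_path):
    tiles = segment_tissue(slide_path)
    ctranspath_features = extract_features(tiles, "ctranspath")
    tumor_tiles = keep_tumor_tiles(ctranspath_features)
    # Optimus features are computed only on tiles that passed the filter.
    optimus_features = extract_features(tumor_tiles, "optimus")
    subtype = predict_subtype(optimus_features)
    biomarkers = predict_biomarkers(optimus_features, subtype)
    return subtype, biomarkers


subtype, biomarkers = run_pipeline("example.svs")
print(subtype, biomarkers)
```

In this ordering the second feature extractor runs only on tiles that survive the marker classifier, so filtering early reduces the work done downstream.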
Design Principles
- Modularity: Each component has a single, well-defined responsibility
- Testability: Modules can be tested independently with mocking
- Reusability: Core analysis functions can be used without UI
- Maintainability: Clear interfaces and documentation
- Extensibility: New models or features can be added with minimal changes
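The testability principle can be illustrated with unittest.mock from the standard library. The sketch assumes a multi-slide function that receives the per-slide analysis as a parameter. That signature is an assumption made for this example and may not match the real analyze_slides().

```python
from unittest import mock


def analyze_slides(paths, analyze_one):
    """Call analyze_one for each path and collect the results keyed by path."""
    return {path: analyze_one(path) for path in paths}


# Replace the expensive per-slide analysis with a mock that records its calls.
fake_analyze = mock.Mock(side_effect=lambda path: f"result for {path}")

results = analyze_slides(["a.svs", "b.svs"], analyze_one=fake_analyze)

# The workflow logic is verified without loading any model.
assert fake_analyze.call_count == 2
fake_analyze.assert_has_calls([mock.call("a.svs"), mock.call("b.svs")])
assert results["a.svs"] == "result for a.svs"
```

The same idea works with mock.patch when the dependency is imported at module level instead of being passed in.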
Future Enhancements
Potential areas for extension:
- Support for additional image formats
- Real-time analysis progress tracking
- Integration with PACS systems
- Support for additional biomarkers
- Batch processing optimization
- Cloud deployment configurations