copilot-swe-agent[bot] raylim committed on
Commit
315cd39
·
1 Parent(s): 71ae2f0

Enhance documentation with additional details


- Add docstrings to __init__.py files
- Add architecture overview section to ARCHITECTURE.md
- Add dependencies and data flow sections
- Add design principles and future enhancements
- Link to ARCHITECTURE.md from README

Co-authored-by: raylim <3074310+raylim@users.noreply.github.com>

ARCHITECTURE.md CHANGED
@@ -1,6 +1,16 @@
1
 
2
  # Mosaic Architecture
3
 
4
  ## Module Structure
5
 
6
  The Mosaic application has been refactored for better readability and maintainability. The codebase is now organized into the following modules:
@@ -94,3 +104,65 @@ Analysis Logic | - | 200 lines | +200
94
 
95
  The refactoring distributed the original monolithic file into focused, maintainable modules while maintaining all functionality.
96
 
1
 
2
  # Mosaic Architecture
3
 
4
+ This document describes the internal architecture and module organization of the Mosaic application.
5
+
6
+ ## Overview
7
+
8
+ Mosaic is a deep learning pipeline for analyzing H&E whole slide images (WSIs) to predict:
9
+ 1. **Cancer Subtypes** using the Aeon model
10
+ 2. **Biomarkers** using the Paladin model
11
+
12
+ The application is organized into several focused modules with clear separation of concerns.
13
+
14
  ## Module Structure
15
 
16
  The Mosaic application has been refactored for better readability and maintainability. The codebase is now organized into the following modules:
 
104
 
105
  The refactoring distributed the original monolithic file into focused, maintainable modules while maintaining all functionality.
106
 
107
+ ## Key Dependencies
108
+
109
+ ### External Libraries
110
+
111
+ - **Gradio**: Web interface framework for creating the UI
112
+ - **PyTorch**: Deep learning framework for model inference
113
+ - **Pandas**: Data manipulation and CSV handling
114
+ - **Mussel**: Pathology-specific utilities for:
115
+ - Tissue segmentation
116
+ - Feature extraction (CTransPath, Optimus)
117
+ - Marker classification
118
+ - **Paladin**: Biomarker prediction models
119
+ - **HuggingFace Hub**: Model downloading and management
120
+ - **Loguru**: Logging with enhanced features
121
+
122
+ ### Model Components
123
+
124
+ 1. **CTransPath**: Pre-trained vision transformer for histopathology feature extraction
125
+ 2. **Optimus**: Foundation model for pathology image features
126
+ 3. **Marker Classifier**: Filters features to tumor-relevant regions
127
+ 4. **Aeon**: Multi-task model for cancer subtype classification
128
+ 5. **Paladin**: Suite of models for biomarker prediction across cancer subtypes
129
+
130
+ ## Data Flow
131
+
132
+ ```
133
+ WSI File (*.svs, *.tif)
134
+ ↓
135
+ Tissue Segmentation (Mussel)
136
+ ↓
137
+ CTransPath Feature Extraction
138
+ ↓
139
+ Marker Classification (filter to tumor regions)
140
+ ↓
141
+ Optimus Feature Extraction (on filtered tiles)
142
+ ↓
143
+ ├── Aeon Inference → Cancer Subtype Predictions
144
+ │ ↓
145
+ └── Paladin Inference β†’ Biomarker Predictions
146
+ ↓
147
+ Results (CSV, Visualizations)
148
+ ```
149
+
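The stage ordering in the Data Flow diagram above can be sketched as a chain of functions that each take and return a shared state dictionary. This is only an illustration of the ordering: every function name, key, and value below is a hypothetical placeholder, and none of it reflects the actual Mosaic, Mussel, Aeon, or Paladin APIs, which this diff does not show. The sketch also assumes, based on the arrow from Aeon to Paladin in the diagram, that Paladin runs after Aeon and can see the predicted subtype.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]


# Every stage below is a hypothetical stub that only mimics the data flow.
def segment_tissue(state: State) -> State:
    # Pretend the slide yields six tissue tiles, identified by index.
    return {**state, "tiles": list(range(6))}


def extract_ctranspath(state: State) -> State:
    # One placeholder feature vector per tissue tile.
    return {**state, "ctranspath": {t: [float(t)] for t in state["tiles"]}}


def classify_markers(state: State) -> State:
    # Placeholder filter: keep even-indexed tiles as "tumor" tiles.
    return {**state, "tumor_tiles": [t for t in state["tiles"] if t % 2 == 0]}


def extract_optimus(state: State) -> State:
    # The second feature pass runs only on the filtered tiles.
    return {**state, "optimus": {t: [t * 0.5] for t in state["tumor_tiles"]}}


def aeon_stub(state: State) -> State:
    return {**state, "subtype": "PLACEHOLDER_SUBTYPE"}


def paladin_stub(state: State) -> State:
    # Assumption: the biomarker stage is conditioned on the Aeon subtype.
    return {**state, "biomarkers": {"conditioned_on": state["subtype"]}}


STAGES: List[Tuple[str, Callable[[State], State]]] = [
    ("tissue_segmentation", segment_tissue),
    ("ctranspath_features", extract_ctranspath),
    ("marker_classification", classify_markers),
    ("optimus_features", extract_optimus),
    ("aeon_inference", aeon_stub),
    ("paladin_inference", paladin_stub),
]


def run_pipeline(wsi_path: str) -> Tuple[State, List[str]]:
    """Run the stub stages in order and record which stages ran."""
    state: State = {"wsi_path": wsi_path}
    trace: List[str] = []
    for name, stage in STAGES:
        state = stage(state)
        trace.append(name)
    return state, trace
```

Calling `run_pipeline("slide.svs")` returns the final state and the list of stage names in execution order. Keeping the stages in a plain list is one possible design; it makes the order explicit and lets a single stage be swapped for a test double.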
150
+ ## Design Principles
151
+
152
+ 1. **Modularity**: Each component has a single, well-defined responsibility
153
+ 2. **Testability**: Modules can be tested independently with mocking
154
+ 3. **Reusability**: Core analysis functions can be used without UI
155
+ 4. **Maintainability**: Clear interfaces and documentation
156
+ 5. **Extensibility**: New models or features can be added with minimal changes
157
+
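As a rough illustration of the Testability principle, the sketch below tests an orchestration function with mocks from the standard library's `unittest.mock` instead of real models. The function `analyze_slide`, its parameters, and the placeholder return values are hypothetical and are not taken from the Mosaic source; the real `run_aeon` and `run_paladin` signatures are not shown in this diff.

```python
from unittest.mock import Mock


def analyze_slide(features, run_aeon, run_paladin):
    """Hypothetical orchestration: predict a subtype, then biomarkers.

    The inference callables are passed in as arguments so that a test
    can replace them with mocks and no model weights are needed.
    """
    subtype = run_aeon(features)
    biomarkers = run_paladin(features, subtype)
    return {"subtype": subtype, "biomarkers": biomarkers}


def test_analyze_slide_passes_subtype_to_paladin():
    # Placeholder return values; they stand in for real model output.
    fake_aeon = Mock(return_value="PLACEHOLDER_SUBTYPE")
    fake_paladin = Mock(return_value={"PLACEHOLDER_MARKER": 0.5})
    features = [[0.1, 0.2]]

    result = analyze_slide(features, fake_aeon, fake_paladin)

    # Each mock records its calls, so the wiring can be checked exactly.
    fake_aeon.assert_called_once_with(features)
    fake_paladin.assert_called_once_with(features, "PLACEHOLDER_SUBTYPE")
    assert result == {
        "subtype": "PLACEHOLDER_SUBTYPE",
        "biomarkers": {"PLACEHOLDER_MARKER": 0.5},
    }
```

Passing the inference functions as parameters is one way to get this isolation; patching module attributes with `unittest.mock.patch` would be an alternative when the call sites cannot be changed.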
158
+ ## Future Enhancements
159
+
160
+ Potential areas for extension:
161
+
162
+ - Support for additional image formats
163
+ - Real-time analysis progress tracking
164
+ - Integration with PACS systems
165
+ - Support for additional biomarkers
166
+ - Batch processing optimization
167
+ - Cloud deployment configurations
168
+
README.md CHANGED
@@ -19,6 +19,7 @@ Mosaic is a deep learning model designed for predicting cancer subtypes and biom
19
  - [Cancer Subtypes](#cancer-subtypes)
20
  - [Troubleshooting](#troubleshooting)
21
  - [Contributing](#contributing)
22
  - [License](#license)
23
 
24
  ### System requirements
@@ -293,6 +294,10 @@ If the default port 7860 is already in use:
293
 
294
  We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to contribute to this project.
295
 
296
  ## License
297
 
298
  This project is licensed under the terms specified in the LICENSE file.
 
19
  - [Cancer Subtypes](#cancer-subtypes)
20
  - [Troubleshooting](#troubleshooting)
21
  - [Contributing](#contributing)
22
+ - [Architecture](#architecture)
23
  - [License](#license)
24
 
25
  ### System requirements
 
294
 
295
  We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to contribute to this project.
296
 
297
+ ## Architecture
298
+
299
+ For detailed information about the code structure and module organization, see [ARCHITECTURE.md](ARCHITECTURE.md).
300
+
301
  ## License
302
 
303
  This project is licensed under the terms specified in the LICENSE file.
src/mosaic/__init__.py CHANGED
@@ -0,0 +1,5 @@
1
+ """Mosaic: H&E Whole Slide Image Cancer Subtype and Biomarker Inference.
2
+
3
+ Mosaic is a deep learning pipeline for analyzing H&E-stained whole slide images
4
+ to predict cancer subtypes (via Aeon) and biomarkers (via Paladin).
5
+ """
src/mosaic/inference/__init__.py CHANGED
@@ -1,2 +1,9 @@
1
  from .aeon import run as run_aeon
2
  from .paladin import run as run_paladin
 
1
+ """Inference module for Mosaic.
2
+
3
+ This module provides the inference interfaces for:
4
+ - Aeon: Cancer subtype prediction from WSI features
5
+ - Paladin: Biomarker prediction from WSI features
6
+ """
7
+
8
  from .aeon import run as run_aeon
9
  from .paladin import run as run_paladin
src/mosaic/ui/__init__.py CHANGED
@@ -1,3 +1,9 @@
1
  from .app import launch_gradio
2
 
3
  __all__ = ["launch_gradio"]
 
1
+ """UI module for Mosaic Gradio web interface.
2
+
3
+ This module provides the web-based user interface for Mosaic,
4
+ allowing interactive analysis of whole slide images through a browser.
5
+ """
6
+
7
  from .app import launch_gradio
8
 
9
  __all__ = ["launch_gradio"]