raylim committed
Commit c5e7bb2 · unverified · 2 parents: 24b5de2 a3010ef

Merge pull request #7 from pathology-data-mining/dev

Add comprehensive sex and tissue site parameter support

.gitignore CHANGED
@@ -17,3 +17,5 @@ data/
 htmlcov/
 flagged/
 gradio_cached_examples/
+*.svs
+*.png
pyproject.toml CHANGED
@@ -10,11 +10,14 @@ readme = "README.md"
 requires-python = ">=3.10"
 dependencies = [
     "gradio>=5.49.0",
+    "lightning>=2.6.0",
     "loguru>=0.7.3",
     "memory-profiler>=0.61.0",
     "mussel[torch-gpu]",
     "paladin",
+    "seaborn>=0.13.2",
     "spaces>=0.30.0",
+    "statsmodels>=0.14.6",
 ]

 [project.scripts]
scripts/README.md ADDED
@@ -0,0 +1,247 @@
+# Mosaic Scripts
+
+This directory contains utility scripts for working with the Mosaic pipeline, particularly for Aeon model testing and deployment.
+
+## Aeon Model Scripts
+
+### 1. export_aeon_checkpoint.py
+
+Export a PyTorch Lightning checkpoint to pickle format for inference.
+
+**Usage:**
+```bash
+python scripts/export_aeon_checkpoint.py \
+    --checkpoint data/checkpoint.ckpt \
+    --output data/aeon_model.pkl \
+    --metadata-dir data/metadata
+```
+
+**Arguments:**
+- `--checkpoint`: Path to PyTorch Lightning checkpoint (.ckpt file)
+- `--output`: Path to save exported model (.pkl file)
+- `--metadata-dir`: Directory containing metadata files (default: data/metadata)
+
+**Requirements:**
+- paladin package from the git repo (must have AeonLightningModule)
+- PyTorch Lightning
+- Metadata files: n_classes.txt, ontology_embedding_dim.txt, target_dict.tsv
+
+**Example:**
+```bash
+# Export the checkpoint
+uv run python scripts/export_aeon_checkpoint.py \
+    --checkpoint data/checkpoint.ckpt \
+    --output data/aeon_model.pkl
+
+# Output:
+# Loading metadata from data/metadata...
+# Loading checkpoint from data/checkpoint.ckpt...
+# Saving model to data/aeon_model.pkl...
+# ✓ Successfully exported checkpoint to data/aeon_model.pkl
+#   Model size: 118.0 MB
+#   Model class: AeonLateAggregator
+#   Number of classes: 160
+#   Ontology embedding dim: 20
+#   Number of histologies: 160
+```
+
+### 2. run_aeon_tests.sh
+
+Run the Aeon model on test slides and validate predictions.
+
+**Usage:**
+```bash
+./scripts/run_aeon_tests.sh
+```
+
+**Configuration:**
+The script reads test samples from `test_slides/test_samples.json` and processes each slide through the full Mosaic pipeline with:
+- Cancer subtype: Unknown (triggers Aeon inference)
+- Segmentation config: Biopsy
+- Number of workers: 4
+
+**Output:**
+- Results saved to `test_slides/results/{slide_id}/`
+- Logs saved to `test_slides/logs/`
+- Summary showing passed/failed tests
+
+**Example Output:**
+```
+=========================================
+Aeon Model Test Suite
+=========================================
+
+Found 3 test slides
+
+=========================================
+Processing slide 1/3: 881837
+=========================================
+Ground Truth:
+  Cancer Subtype: BLCA
+  Site Type: Primary
+  Sex: Male
+  Tissue Site: Bladder
+
+Running Mosaic pipeline...
+
+Aeon Prediction:
+  Predicted: BLCA
+  Confidence: 0.9819
+
+✓ PASS: Prediction matches ground truth
+
+[... continues for all slides ...]
+
+=========================================
+Test Summary
+=========================================
+Total slides: 3
+Passed: 3
+Failed: 0
+
+All tests passed!
+```
+
+### 3. verify_aeon_results.py
+
+Verify Aeon test results against expected ground truth.
+
+**Usage:**
+```bash
+python scripts/verify_aeon_results.py \
+    --test-samples test_slides/test_samples.json \
+    --results-dir test_slides/results \
+    --output test_slides/verification_report.json
+```
+
+**Arguments:**
+- `--test-samples`: Path to test samples JSON file (default: test_slides/test_samples.json)
+- `--results-dir`: Directory containing results (default: test_slides/results)
+- `--output`: Optional path to save verification report as JSON
+
+**Example:**
+```bash
+# Verify results and save report
+uv run python scripts/verify_aeon_results.py \
+    --output test_slides/verification_report.json
+
+# Output:
+# ================================================================================
+# Aeon Model Verification Report
+# ================================================================================
+#
+# Slide: 881837
+#   Ground Truth: BLCA
+#   Site Type: Primary
+#   Sex: Male
+#   Tissue Site: Bladder
+#   Predicted: BLCA
+#   Confidence: 0.9819 (98.19%)
+#   Status: ✓ PASS
+#
+# [... continues for all slides ...]
+#
+# ================================================================================
+# Summary
+# ================================================================================
+# Total slides: 3
+# Passed: 3 (100.0%)
+# Failed: 0 (0.0%)
+#
+# ✓ All tests passed!
+#
+# Confidence Statistics (for passed tests):
+#   Average: 0.9910 (99.10%)
+#   Minimum: 0.9819 (98.19%)
+#   Maximum: 0.9961 (99.61%)
+```
+
+## Workflow
+
+### Complete Testing Workflow
+
+1. **Export checkpoint** (if needed):
+   ```bash
+   uv run python scripts/export_aeon_checkpoint.py \
+       --checkpoint data/checkpoint.ckpt \
+       --output data/aeon_model.pkl
+   ```
+
+2. **Run tests**:
+   ```bash
+   ./scripts/run_aeon_tests.sh
+   ```
+
+3. **Verify results**:
+   ```bash
+   uv run python scripts/verify_aeon_results.py \
+       --output test_slides/verification_report.json
+   ```
+
+### Quick Verification
+
+If you already have test results and just want to verify them:
+
+```bash
+uv run python scripts/verify_aeon_results.py
+```
+
+## Test Samples Format
+
+The test samples JSON file should have this format:
+
+```json
+[
+  {
+    "slide_id": "881837",
+    "cancer_subtype": "BLCA",
+    "site_type": "Primary",
+    "sex": "Male",
+    "tissue_site": "Bladder"
+  },
+  {
+    "slide_id": "744547",
+    "cancer_subtype": "HCC",
+    "site_type": "Metastatic",
+    "sex": "Male",
+    "tissue_site": "Liver"
+  }
+]
+```
+
+## Dependencies
+
+All scripts require:
+- Python 3.10+
+- uv package manager
+- Mosaic package with dependencies
+
+Additional requirements for checkpoint export:
+- paladin from the git repository (dev branch)
+- PyTorch Lightning
+
+## Exit Codes
+
+- `0`: Success (all tests passed)
+- `1`: Failure (one or more tests failed)
+
+## Troubleshooting
+
+### "AeonLightningModule not found"
+```bash
+uv sync --upgrade-package paladin
+```
+
+### "Metadata files not found"
+Make sure you have:
+- `data/metadata/n_classes.txt`
+- `data/metadata/ontology_embedding_dim.txt`
+- `data/metadata/target_dict.tsv`
+
+### "Test slides not found"
+Place your test slides in the `test_slides/` directory and update `test_samples.json` with correct paths.
+
+## See Also
+
+- [AEON_TEST_SUMMARY.md](../test_slides/AEON_TEST_SUMMARY.md) - Detailed test results and validation
+- [README.md](../README.md) - Main Mosaic documentation
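The scripts added in this commit tolerate two key spellings per sample (`slide_id` vs `image_id`, `cancer_subtype` vs `cancer_type`, both via `.get(...) or .get(...)` fallbacks). A minimal standalone loader using the same fallback, on illustrative inline data rather than a real `test_samples.json`:

```python
import json

# Inline stand-in for test_slides/test_samples.json; one sample per spelling.
samples = json.loads("""
[
  {"slide_id": "881837", "cancer_subtype": "BLCA"},
  {"image_id": "744547", "cancer_type": "HCC"}
]
""")

# Same key fallback as run_aeon_tests.sh and verify_aeon_results.py.
rows = [
    (s.get("slide_id") or s.get("image_id"),
     s.get("cancer_subtype") or s.get("cancer_type"))
    for s in samples
]
print(rows)  # [('881837', 'BLCA'), ('744547', 'HCC')]
```

Note the fallback only covers missing keys, not `null` values that should be kept.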
scripts/export_aeon_checkpoint.py ADDED
@@ -0,0 +1,142 @@
+#!/usr/bin/env python
+"""
+Export Aeon PyTorch Lightning checkpoint to pickle format for inference.
+
+This script converts a PyTorch Lightning checkpoint (.ckpt) file to a pickle
+(.pkl) file that can be used with the Mosaic inference pipeline.
+
+Usage:
+    python export_aeon_checkpoint.py \
+        --checkpoint data/checkpoint.ckpt \
+        --output data/aeon_model.pkl \
+        --metadata-dir data/metadata
+
+Requirements:
+    - paladin package from git repo (must have AeonLightningModule)
+    - PyTorch Lightning
+    - Access to metadata files (n_classes.txt, ontology_embedding_dim.txt, target_dict.tsv)
+"""
+
+import argparse
+import json
+import pickle
+from pathlib import Path
+
+
+def load_metadata(metadata_dir: Path):
+    """Load metadata required for model initialization.
+
+    Args:
+        metadata_dir: Directory containing metadata files
+
+    Returns:
+        SimpleMetadata object with n_classes, ontology_embedding_dim, and target_dicts
+    """
+    # Read n_classes
+    with open(metadata_dir / "n_classes.txt") as f:
+        n_classes = int(f.read().strip())
+
+    # Read ontology_embedding_dim
+    with open(metadata_dir / "ontology_embedding_dim.txt") as f:
+        ontology_embedding_dim = int(f.read().strip())
+
+    # Read target_dict (JSON format with single quotes)
+    with open(metadata_dir / "target_dict.tsv") as f:
+        target_dict_str = f.read().strip().replace("'", '"')
+        target_dict = json.loads(target_dict_str)
+
+    # Create simple metadata object
+    class SimpleMetadata:
+        def __init__(self, n_classes, ontology_embedding_dim, target_dict):
+            self.n_classes = n_classes
+            self.ontology_embedding_dim = ontology_embedding_dim
+            self.target_dicts = [target_dict]
+
+    return SimpleMetadata(n_classes, ontology_embedding_dim, target_dict)
+
+
+def export_checkpoint(checkpoint_path: Path, output_path: Path, metadata_dir: Path):
+    """Export PyTorch Lightning checkpoint to pickle format.
+
+    Args:
+        checkpoint_path: Path to .ckpt file
+        output_path: Path to save .pkl file
+        metadata_dir: Directory containing metadata files
+    """
+    try:
+        from paladin.pl_modules.aeon import AeonLightningModule
+    except ImportError:
+        raise ImportError(
+            "Failed to import AeonLightningModule. "
+            "Make sure paladin is installed from the git repository:\n"
+            "  uv sync --upgrade-package paladin"
+        )
+
+    print(f"Loading metadata from {metadata_dir}...")
+    metadata = load_metadata(metadata_dir)
+
+    print(f"Loading checkpoint from {checkpoint_path}...")
+    pl_module = AeonLightningModule.load_from_checkpoint(
+        str(checkpoint_path),
+        metadata=metadata
+    )
+
+    # Extract the model
+    model = pl_module.model
+
+    print(f"Saving model to {output_path}...")
+    with open(output_path, "wb") as f:
+        pickle.dump(model, f)
+
+    print(f"✓ Successfully exported checkpoint to {output_path}")
+
+    # Print model info
+    file_size = output_path.stat().st_size / (1024 * 1024)  # MB
+    print(f"  Model size: {file_size:.1f} MB")
+    print(f"  Model class: {type(model).__name__}")
+    print(f"  Number of classes: {metadata.n_classes}")
+    print(f"  Ontology embedding dim: {metadata.ontology_embedding_dim}")
+    print(f"  Number of histologies: {len(metadata.target_dicts[0]['histologies'])}")
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Export Aeon PyTorch Lightning checkpoint to pickle format"
+    )
+    parser.add_argument(
+        "--checkpoint",
+        type=Path,
+        required=True,
+        help="Path to PyTorch Lightning checkpoint (.ckpt)"
+    )
+    parser.add_argument(
+        "--output",
+        type=Path,
+        required=True,
+        help="Path to save exported model (.pkl)"
+    )
+    parser.add_argument(
+        "--metadata-dir",
+        type=Path,
+        default=Path("data/metadata"),
+        help="Directory containing metadata files (default: data/metadata)"
+    )
+
+    args = parser.parse_args()
+
+    # Validate inputs
+    if not args.checkpoint.exists():
+        raise FileNotFoundError(f"Checkpoint not found: {args.checkpoint}")
+
+    if not args.metadata_dir.exists():
+        raise FileNotFoundError(f"Metadata directory not found: {args.metadata_dir}")
+
+    # Create output directory if needed
+    args.output.parent.mkdir(parents=True, exist_ok=True)
+
+    # Export checkpoint
+    export_checkpoint(args.checkpoint, args.output, args.metadata_dir)
+
+
+if __name__ == "__main__":
+    main()
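The export path ends in a bare `pickle.dump` of the model object. One caveat worth knowing: pickle stores classes by reference, so the class definition (here, paladin's `AeonLateAggregator`) must be importable wherever the `.pkl` is later loaded. A self-contained sketch of the round-trip using a hypothetical stand-in class, not the real model:

```python
import pickle
import tempfile
from pathlib import Path

class DummyModel:
    """Hypothetical stand-in for the exported model object."""
    def __init__(self, n_classes):
        self.n_classes = n_classes

# Export side: what export_aeon_checkpoint.py does after extracting pl_module.model.
out = Path(tempfile.mkdtemp()) / "aeon_model.pkl"
with open(out, "wb") as f:
    pickle.dump(DummyModel(n_classes=160), f)

# Load side: pickle.load only works if DummyModel is importable here too;
# unpickling the real export similarly requires paladin to be installed.
with open(out, "rb") as f:
    model = pickle.load(f)
print(model.n_classes)  # 160
```

This is why the exported pickle is tied to the paladin package version that defined the class, unlike a plain `state_dict` export.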
scripts/run_aeon_tests.sh ADDED
@@ -0,0 +1,175 @@
+#!/bin/bash
+# Aeon Model Test Script
+# This script runs the Aeon cancer subtype prediction model on test slides
+# for reproducibility and validation.
+
+set -e  # Exit on error
+
+# Configuration
+TEST_SAMPLES_FILE="test_slides/test_samples.json"
+RESULTS_DIR="test_slides/results"
+LOG_DIR="test_slides/logs"
+SEGMENTATION_CONFIG="Biopsy"
+NUM_WORKERS=4
+
+# Colors for output
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+RED='\033[0;31m'
+NC='\033[0m'  # No Color
+
+echo "========================================="
+echo "Aeon Model Test Suite"
+echo "========================================="
+echo ""
+
+# Create directories
+mkdir -p "${RESULTS_DIR}"
+mkdir -p "${LOG_DIR}"
+
+# Check if test samples file exists
+if [ ! -f "${TEST_SAMPLES_FILE}" ]; then
+    echo -e "${RED}Error: Test samples file not found: ${TEST_SAMPLES_FILE}${NC}"
+    exit 1
+fi
+
+# Read test samples
+echo "Reading test samples from ${TEST_SAMPLES_FILE}..."
+SLIDE_IDS=$(python3 -c "
+import json
+with open('${TEST_SAMPLES_FILE}') as f:
+    samples = json.load(f)
+for sample in samples:
+    slide_id = sample.get('slide_id') or sample.get('image_id')
+    print(slide_id)
+")
+
+# Count slides
+NUM_SLIDES=$(echo "${SLIDE_IDS}" | wc -l)
+echo -e "${GREEN}Found ${NUM_SLIDES} test slides${NC}"
+echo ""
+
+# Process each slide
+CURRENT=0
+PASSED=0
+FAILED=0
+
+for SLIDE_ID in ${SLIDE_IDS}; do
+    CURRENT=$((CURRENT + 1))
+
+    echo "========================================="
+    echo -e "${YELLOW}Processing slide ${CURRENT}/${NUM_SLIDES}: ${SLIDE_ID}${NC}"
+    echo "========================================="
+
+    # Get slide metadata
+    METADATA=$(python3 -c "
+import json
+with open('${TEST_SAMPLES_FILE}') as f:
+    samples = json.load(f)
+for sample in samples:
+    slide_id = sample.get('slide_id') or sample.get('image_id')
+    if slide_id == '${SLIDE_ID}':
+        cancer_subtype = sample.get('cancer_subtype') or sample.get('cancer_type')
+        print(f\"{cancer_subtype}|{sample['site_type']}|{sample['sex']}|{sample['tissue_site']}\")
+        break
+")
+
+    IFS='|' read -r CANCER_SUBTYPE SITE_TYPE SEX TISSUE_SITE <<< "${METADATA}"
+
+    echo "Ground Truth:"
+    echo "  Cancer Subtype: ${CANCER_SUBTYPE}"
+    echo "  Site Type: ${SITE_TYPE}"
+    echo "  Sex: ${SEX}"
+    echo "  Tissue Site: ${TISSUE_SITE}"
+    echo ""
+
+    # Find slide file
+    SLIDE_FILE=$(find test_slides -name "${SLIDE_ID}.svs" -o -name "${SLIDE_ID}.tiff" -o -name "${SLIDE_ID}.ndpi" 2>/dev/null | head -1)
+
+    if [ -z "${SLIDE_FILE}" ]; then
+        echo -e "${RED}Error: Slide file not found for ${SLIDE_ID}${NC}"
+        FAILED=$((FAILED + 1))
+        continue
+    fi
+
+    echo "Slide file: ${SLIDE_FILE}"
+    echo ""
+
+    # Run Mosaic pipeline with Aeon inference
+    LOG_FILE="${LOG_DIR}/${SLIDE_ID}_aeon_test.log"
+
+    echo "Running Mosaic pipeline..."
+    if uv run python -m mosaic.cli \
+        --input-slide "${SLIDE_FILE}" \
+        --output-dir "${RESULTS_DIR}/${SLIDE_ID}" \
+        --cancer-subtype "Unknown" \
+        --site-type "${SITE_TYPE}" \
+        --sex "${SEX}" \
+        --tissue-site "${TISSUE_SITE}" \
+        --segmentation-config "${SEGMENTATION_CONFIG}" \
+        --num-workers "${NUM_WORKERS}" \
+        > "${LOG_FILE}" 2>&1; then
+
+        # Check if results exist
+        AEON_RESULTS="${RESULTS_DIR}/${SLIDE_ID}/${SLIDE_ID}_aeon_results.csv"
+
+        if [ -f "${AEON_RESULTS}" ]; then
+            # Extract prediction
+            PREDICTION=$(python3 -c "
+import pandas as pd
+df = pd.read_csv('${AEON_RESULTS}')
+if not df.empty:
+    print(f\"{df.iloc[0]['Cancer Subtype']}|{df.iloc[0]['Confidence']:.4f}\")
+")
+
+            IFS='|' read -r PRED_SUBTYPE CONFIDENCE <<< "${PREDICTION}"
+
+            echo ""
+            echo "Aeon Prediction:"
+            echo "  Predicted: ${PRED_SUBTYPE}"
+            echo "  Confidence: ${CONFIDENCE}"
+            echo ""
+
+            # Check if prediction matches ground truth
+            if [ "${PRED_SUBTYPE}" == "${CANCER_SUBTYPE}" ]; then
+                echo -e "${GREEN}✓ PASS: Prediction matches ground truth${NC}"
+                PASSED=$((PASSED + 1))
+            else
+                echo -e "${RED}✗ FAIL: Prediction does not match ground truth${NC}"
+                echo "  Expected: ${CANCER_SUBTYPE}"
+                echo "  Got: ${PRED_SUBTYPE}"
+                FAILED=$((FAILED + 1))
+            fi
+        else
+            echo -e "${RED}✗ FAIL: Aeon results file not found${NC}"
+            FAILED=$((FAILED + 1))
+        fi
+    else
+        echo -e "${RED}✗ FAIL: Mosaic pipeline failed${NC}"
+        echo "Check log file: ${LOG_FILE}"
+        FAILED=$((FAILED + 1))
+    fi
+
+    echo ""
+done
+
+# Summary
+echo "========================================="
+echo "Test Summary"
+echo "========================================="
+echo "Total slides: ${NUM_SLIDES}"
+echo -e "${GREEN}Passed: ${PASSED}${NC}"
+if [ ${FAILED} -gt 0 ]; then
+    echo -e "${RED}Failed: ${FAILED}${NC}"
+else
+    echo "Failed: ${FAILED}"
+fi
+echo ""
+
+if [ ${FAILED} -eq 0 ]; then
+    echo -e "${GREEN}All tests passed!${NC}"
+    exit 0
+else
+    echo -e "${RED}Some tests failed. Check logs in ${LOG_DIR}${NC}"
+    exit 1
+fi
scripts/verify_aeon_results.py ADDED
@@ -0,0 +1,224 @@
+#!/usr/bin/env python
+"""
+Verify Aeon test results against expected ground truth.
+
+This script reads the test results and compares them against the ground truth
+values in test_samples.json to validate the Aeon model predictions.
+
+Usage:
+    python verify_aeon_results.py \
+        --test-samples test_slides/test_samples.json \
+        --results-dir test_slides/results
+"""
+
+import argparse
+import json
+from pathlib import Path
+import pandas as pd
+from typing import Dict, List, Tuple
+
+
+def load_test_samples(test_samples_file: Path) -> List[Dict]:
+    """Load test samples from JSON file.
+
+    Args:
+        test_samples_file: Path to test_samples.json
+
+    Returns:
+        List of test sample dictionaries
+    """
+    with open(test_samples_file) as f:
+        return json.load(f)
+
+
+def load_aeon_results(slide_id: str, results_dir: Path) -> Tuple[str, float]:
+    """Load Aeon prediction results for a slide.
+
+    Args:
+        slide_id: Slide identifier
+        results_dir: Directory containing results
+
+    Returns:
+        Tuple of (predicted_subtype, confidence)
+    """
+    results_file = results_dir / slide_id / f"{slide_id}_aeon_results.csv"
+
+    if not results_file.exists():
+        raise FileNotFoundError(f"Results file not found: {results_file}")
+
+    df = pd.read_csv(results_file)
+
+    if df.empty:
+        raise ValueError(f"Empty results file: {results_file}")
+
+    # Get top prediction
+    top_prediction = df.iloc[0]
+    return top_prediction["Cancer Subtype"], top_prediction["Confidence"]
+
+
+def verify_results(test_samples: List[Dict], results_dir: Path) -> Dict:
+    """Verify all test results against ground truth.
+
+    Args:
+        test_samples: List of test sample dictionaries
+        results_dir: Directory containing results
+
+    Returns:
+        Dictionary with verification statistics
+    """
+    total = len(test_samples)
+    passed = 0
+    failed = 0
+    results = []
+
+    print("=" * 80)
+    print("Aeon Model Verification Report")
+    print("=" * 80)
+    print()
+
+    for sample in test_samples:
+        slide_id = sample.get("slide_id") or sample.get("image_id")
+        ground_truth = sample.get("cancer_subtype") or sample.get("cancer_type")
+        site_type = sample["site_type"]
+        sex = sample["sex"]
+        tissue_site = sample["tissue_site"]
+
+        print(f"Slide: {slide_id}")
+        print(f"  Ground Truth: {ground_truth}")
+        print(f"  Site Type: {site_type}")
+        print(f"  Sex: {sex}")
+        print(f"  Tissue Site: {tissue_site}")
+
+        try:
+            predicted, confidence = load_aeon_results(slide_id, results_dir)
+
+            print(f"  Predicted: {predicted}")
+            print(f"  Confidence: {confidence:.4f} ({confidence * 100:.2f}%)")
+
+            # Check if prediction matches
+            if predicted == ground_truth:
+                print("  Status: ✓ PASS")
+                passed += 1
+                status = "PASS"
+            else:
+                print(f"  Status: ✗ FAIL (expected {ground_truth}, got {predicted})")
+                failed += 1
+                status = "FAIL"
+
+            results.append({
+                "slide_id": slide_id,
+                "ground_truth": ground_truth,
+                "predicted": predicted,
+                "confidence": confidence,
+                "site_type": site_type,
+                "sex": sex,
+                "tissue_site": tissue_site,
+                "status": status
+            })
+
+        except Exception as e:
+            print(f"  Status: ✗ ERROR - {e}")
+            failed += 1
+            results.append({
+                "slide_id": slide_id,
+                "ground_truth": ground_truth,
+                "predicted": None,
+                "confidence": None,
+                "site_type": site_type,
+                "sex": sex,
+                "tissue_site": tissue_site,
+                "status": "ERROR",
+                "error": str(e)
+            })
+
+        print()
+
+    # Print summary
+    print("=" * 80)
+    print("Summary")
+    print("=" * 80)
+    print(f"Total slides: {total}")
+    print(f"Passed: {passed} ({passed / total * 100:.1f}%)")
+    print(f"Failed: {failed} ({failed / total * 100:.1f}%)")
+    print()
+
+    if passed == total:
+        print("✓ All tests passed!")
+    else:
+        print(f"✗ {failed} test(s) failed")
+
+    # Calculate statistics for passed tests
+    if passed > 0:
+        confidences = [r["confidence"] for r in results if r["status"] == "PASS"]
+        avg_confidence = sum(confidences) / len(confidences)
+        min_confidence = min(confidences)
+        max_confidence = max(confidences)
+
+        print()
+        print("Confidence Statistics (for passed tests):")
+        print(f"  Average: {avg_confidence:.4f} ({avg_confidence * 100:.2f}%)")
+        print(f"  Minimum: {min_confidence:.4f} ({min_confidence * 100:.2f}%)")
+        print(f"  Maximum: {max_confidence:.4f} ({max_confidence * 100:.2f}%)")
+
+    return {
+        "total": total,
+        "passed": passed,
+        "failed": failed,
+        "accuracy": passed / total if total > 0 else 0,
+        "results": results
+    }
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Verify Aeon test results against ground truth"
+    )
+    parser.add_argument(
+        "--test-samples",
+        type=Path,
+        default=Path("test_slides/test_samples.json"),
+        help="Path to test_samples.json (default: test_slides/test_samples.json)"
+    )
+    parser.add_argument(
+        "--results-dir",
+        type=Path,
+        default=Path("test_slides/results"),
+        help="Directory containing results (default: test_slides/results)"
+    )
+    parser.add_argument(
+        "--output",
+        type=Path,
+        help="Optional path to save verification report as JSON"
+    )
+
+    args = parser.parse_args()
+
+    # Validate inputs
+    if not args.test_samples.exists():
+        raise FileNotFoundError(f"Test samples file not found: {args.test_samples}")
+
+    if not args.results_dir.exists():
+        raise FileNotFoundError(f"Results directory not found: {args.results_dir}")
+
+    # Load test samples
+    test_samples = load_test_samples(args.test_samples)
+
+    # Verify results
+    verification_report = verify_results(test_samples, args.results_dir)
+
+    # Save report if requested
+    if args.output:
+        with open(args.output, "w") as f:
+            json.dump(verification_report, f, indent=2)
+        print()
+        print(f"Verification report saved to: {args.output}")
+
+    # Exit with appropriate code
+    if verification_report["failed"] > 0:
+        exit(1)
+    else:
+        exit(0)
+
+
+if __name__ == "__main__":
+    main()
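The summary arithmetic in `verify_results()` (accuracy plus mean/min/max confidence over passing slides) reduces to a few lines. A standalone reproduction on hand-made, purely illustrative records, not real pipeline output:

```python
# Illustrative result records mirroring the report's per-slide dicts.
results = [
    {"slide_id": "A", "status": "PASS", "confidence": 0.98},
    {"slide_id": "B", "status": "PASS", "confidence": 0.99},
    {"slide_id": "C", "status": "FAIL", "confidence": 0.55},
]

# Accuracy over all slides; confidence stats only over PASS slides,
# exactly as verify_results() computes them.
passed = [r for r in results if r["status"] == "PASS"]
accuracy = len(passed) / len(results)
avg_conf = sum(r["confidence"] for r in passed) / len(passed)
print(f"accuracy={accuracy:.3f} mean_conf={avg_conf:.4f}")
```

Restricting the confidence statistics to passing slides means a confidently wrong prediction never drags the reported average down; that is a deliberate property of the report, worth keeping in mind when reading it.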
src/mosaic/analysis.py CHANGED
@@ -154,13 +154,15 @@ def _extract_optimus_features(filtered_coords, slide_path, attrs, num_workers):
154
  return features
155
 
156
 
157
- def _run_aeon_inference(features, site_type, num_workers):
158
  """Run Aeon cancer subtype inference on GPU.
159
 
160
  Args:
161
  features: Optimus features
162
  site_type: Site type ("Primary" or "Metastatic")
163
  num_workers: Number of worker processes
 
 
164
 
165
  Returns:
166
  Aeon results DataFrame
@@ -183,6 +185,8 @@ def _run_aeon_inference(features, site_type, num_workers):
183
  metastatic=(site_type == "Metastatic"),
184
  batch_size=8,
185
  num_workers=num_workers,
 
 
186
  use_cpu=False,
187
  )
188
  end_time = pd.Timestamp.now()
@@ -260,6 +264,8 @@ def _run_inference_pipeline_free(
260
  slide_path,
261
  attrs,
262
  site_type,
 
 
263
  cancer_subtype,
264
  cancer_subtype_name_map,
265
  num_workers,
@@ -267,8 +273,8 @@ def _run_inference_pipeline_free(
267
  ):
268
  """Run inference pipeline with 60s GPU limit (for free users)."""
269
  return _run_inference_pipeline_impl(
270
- coords, slide_path, attrs, site_type, cancer_subtype,
271
- cancer_subtype_name_map, num_workers, progress
272
  )
273
 
274
 
@@ -278,6 +284,8 @@ def _run_inference_pipeline_pro(
278
  slide_path,
279
  attrs,
280
  site_type,
 
 
281
  cancer_subtype,
282
  cancer_subtype_name_map,
283
  num_workers,
@@ -285,8 +293,8 @@ def _run_inference_pipeline_pro(
285
  ):
286
  """Run inference pipeline with 300s GPU limit (for PRO users)."""
287
  return _run_inference_pipeline_impl(
288
- coords, slide_path, attrs, site_type, cancer_subtype,
289
- cancer_subtype_name_map, num_workers, progress
290
  )
291
 
292
 
@@ -295,6 +303,8 @@ def _run_inference_pipeline_impl(
295
  slide_path,
296
  attrs,
297
  site_type,
 
 
298
  cancer_subtype,
299
  cancer_subtype_name_map,
300
  num_workers,
@@ -351,7 +361,7 @@ def _run_inference_pipeline_impl(
351
  # Step 5: Run Aeon to predict histology if not supplied
352
  if cancer_subtype == "Unknown":
353
  progress(0.9, desc="Running Aeon for cancer subtype inference")
354
- aeon_results = _run_aeon_inference(features, site_type, num_workers)
355
  else:
356
  cancer_subtype_code = cancer_subtype_name_map.get(cancer_subtype)
357
  aeon_results = pd.DataFrame(
@@ -379,6 +389,8 @@ def analyze_slide(
379
  slide_path,
380
  seg_config,
381
  site_type,
 
 
382
  cancer_subtype,
383
  cancer_subtype_name_map,
384
  ihc_subtype="",
@@ -507,6 +519,17 @@ def analyze_slide(
507
  import traceback
508
  logger.warning(traceback.format_exc())
509
 
 
 
 
 
 
 
 
 
 
 
 
510
  if is_logged_in:
511
  logger.info("Using 300s GPU allocation (logged-in user)")
512
  aeon_results, paladin_results = _run_inference_pipeline_pro(
@@ -514,6 +537,8 @@ def analyze_slide(
514
  slide_path,
515
  attrs,
516
  site_type,
 
 
517
  cancer_subtype,
518
  cancer_subtype_name_map,
519
  num_workers,
@@ -526,6 +551,8 @@ def analyze_slide(
526
  slide_path,
527
  attrs,
528
  site_type,
 
 
529
  cancer_subtype,
530
  cancer_subtype_name_map,
531
  num_workers,
 
154
  return features
155
 
156
 
157
+ def _run_aeon_inference(features, site_type, num_workers, sex=None, tissue_site_idx=None):
158
  """Run Aeon cancer subtype inference on GPU.
159
 
160
  Args:
161
  features: Optimus features
162
  site_type: Site type ("Primary" or "Metastatic")
163
  num_workers: Number of worker processes
164
+ sex: Patient sex (0=Male, 1=Female), optional
165
+ tissue_site_idx: Tissue site index (0-56), optional
166
 
167
  Returns:
168
  Aeon results DataFrame
 
  metastatic=(site_type == "Metastatic"),
  batch_size=8,
  num_workers=num_workers,
+ sex=sex,
+ tissue_site_idx=tissue_site_idx,
  use_cpu=False,
  )
  end_time = pd.Timestamp.now()

  slide_path,
  attrs,
  site_type,
+ sex,
+ tissue_site_idx,
  cancer_subtype,
  cancer_subtype_name_map,
  num_workers,

  ):
  """Run inference pipeline with 60s GPU limit (for free users)."""
  return _run_inference_pipeline_impl(
+ coords, slide_path, attrs, site_type, sex, tissue_site_idx,
+ cancer_subtype, cancer_subtype_name_map, num_workers, progress
  )

  slide_path,
  attrs,
  site_type,
+ sex,
+ tissue_site_idx,
  cancer_subtype,
  cancer_subtype_name_map,
  num_workers,

  ):
  """Run inference pipeline with 300s GPU limit (for PRO users)."""
  return _run_inference_pipeline_impl(
+ coords, slide_path, attrs, site_type, sex, tissue_site_idx,
+ cancer_subtype, cancer_subtype_name_map, num_workers, progress
  )

  slide_path,
  attrs,
  site_type,
+ sex,
+ tissue_site_idx,
  cancer_subtype,
  cancer_subtype_name_map,
  num_workers,

  # Step 5: Run Aeon to predict histology if not supplied
  if cancer_subtype == "Unknown":
  progress(0.9, desc="Running Aeon for cancer subtype inference")
+ aeon_results = _run_aeon_inference(features, site_type, num_workers, sex, tissue_site_idx)
  else:
  cancer_subtype_code = cancer_subtype_name_map.get(cancer_subtype)
  aeon_results = pd.DataFrame(

  slide_path,
  seg_config,
  site_type,
+ sex,
+ tissue_site,
  cancer_subtype,
  cancer_subtype_name_map,
  ihc_subtype="",

  import traceback
  logger.warning(traceback.format_exc())

+ # Convert sex and tissue_site to indices for Aeon model
+ from mosaic.inference.data import encode_sex, encode_tissue_site
+
+ sex_idx = None
+ if sex is not None:
+ sex_idx = encode_sex(sex)
+
+ tissue_site_idx = None
+ if tissue_site is not None:
+ tissue_site_idx = encode_tissue_site(tissue_site)
+
  if is_logged_in:
  logger.info("Using 300s GPU allocation (logged-in user)")
  aeon_results, paladin_results = _run_inference_pipeline_pro(

  slide_path,
  attrs,
  site_type,
+ sex_idx,
+ tissue_site_idx,
  cancer_subtype,
  cancer_subtype_name_map,
  num_workers,

  slide_path,
  attrs,
  site_type,
+ sex_idx,
+ tissue_site_idx,
  cancer_subtype,
  cancer_subtype_name_map,
  num_workers,
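The last hunk above converts the UI-level strings to model indices only when a value was supplied, passing `None` through otherwise. A minimal sketch of that guard pattern; the two encoder stubs are simplified stand-ins for `encode_sex` / `encode_tissue_site` from `mosaic.inference.data` (the real functions load their mappings from CSV files), with index values taken from the test summary later in this commit:

```python
def encode_sex(sex):
    # Stand-in mapping; the real function reads data/sex_original_to_idx.csv
    return {"Male": 0, "Female": 1, "Unknown": 2}.get(sex, 2)

def encode_tissue_site(site):
    # Stand-in mapping; the real function reads the tissue-site CSV
    # (8 is the "Not Applicable" fallback index used in the diff)
    return {"Bladder": 11, "Liver": 26}.get(site, 8)

def to_indices(sex, tissue_site):
    """Mirror the guard logic above: only encode when a value was supplied."""
    sex_idx = encode_sex(sex) if sex is not None else None
    tissue_site_idx = encode_tissue_site(tissue_site) if tissue_site is not None else None
    return sex_idx, tissue_site_idx
```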
src/mosaic/gradio_app.py CHANGED
@@ -21,6 +21,7 @@ from mosaic.ui.utils import (
  validate_settings,
  IHC_SUBTYPES,
  SETTINGS_COLUMNS,
+ SEX_OPTIONS,
  )
  from mosaic.analysis import analyze_slide

@@ -43,10 +44,10 @@ def download_and_process_models():
  "data/paladin_model_map.csv",
  )
  cancer_subtypes = model_map["cancer_subtype"].unique().tolist()
- cancer_subtype_name_map = {
+ cancer_subtype_name_map = {"Unknown": "UNK"}
+ cancer_subtype_name_map.update({
  f"{get_oncotree_code_name(code)} ({code})": code for code in cancer_subtypes
- }
- cancer_subtype_name_map["Unknown"] = "UNK"
+ })
  reversed_cancer_subtype_name_map = {
  value: key for key, value in cancer_subtype_name_map.items()
  }
@@ -99,6 +100,19 @@ def main():
  default="Primary",
  help="Site type of the slide (for single slide processing)",
  )
+ parser.add_argument(
+ "--sex",
+ type=str,
+ choices=SEX_OPTIONS,
+ default="Unknown",
+ help="Sex of the patient (for single slide processing)",
+ )
+ parser.add_argument(
+ "--tissue-site",
+ type=str,
+ default="Unknown",
+ help="Tissue site of the slide (for single slide processing)",
+ )
  parser.add_argument(
  "--cancer-subtype",
  type=str,
@@ -144,6 +158,8 @@
  [
  args.slide_path,
  args.site_type,
+ args.sex,
+ args.tissue_site,
  args.cancer_subtype,
  args.ihc_subtype,
  args.segmentation_config,
@@ -156,6 +172,8 @@
  args.slide_path,
  args.segmentation_config,
  args.site_type,
+ args.sex,
+ args.tissue_site,
  args.cancer_subtype,
  cancer_subtype_name_map,
  args.ihc_subtype,
@@ -191,6 +209,8 @@
  slide_path = row["Slide"]
  seg_config = row["Segmentation Config"]
  site_type = row["Site Type"]
+ sex = row.get("Sex", "Unknown")
+ tissue_site = row.get("Tissue Site", "Unknown")
  cancer_subtype = row["Cancer Subtype"]
  ihc_subtype = row.get("IHC Subtype", "")
  logger.info(
@@ -200,6 +220,8 @@
  slide_path,
  seg_config,
  site_type,
+ sex,
+ tissue_site,
  cancer_subtype,
  cancer_subtype_name_map,
  ihc_subtype,
src/mosaic/inference/aeon.py CHANGED
@@ -4,6 +4,7 @@ This module provides functionality to run the Aeon deep learning model
  for predicting cancer subtypes from H&E whole slide image features.
  """

+ import json
  import pickle # nosec
  import sys
  from argparse import ArgumentParser
@@ -16,36 +17,21 @@ from torch.utils.data import DataLoader
  from mosaic.inference.data import (
  SiteType,
  TileFeatureTensorDataset,
- INT_TO_CANCER_TYPE_MAP,
- CANCER_TYPE_TO_INT_MAP,
+ encode_sex,
+ encode_tissue_site,
  )

  from loguru import logger

  # Cancer types excluded from prediction (too broad or ambiguous)
- cancer_types_to_drop = [
+ # These are used to mask out predictions for overly general cancer types
+ CANCER_TYPES_TO_DROP = [
  "UDMN",
  "ADNOS",
  "CUP",
  "CUPNOS",
- "BRCNOS",
- "GNOS",
- "SCCNOS",
- "PDC",
- "NSCLC",
- "BRCA",
- "SARCNOS",
- "NETNOS",
- "MEL",
- "RCC",
- "BRCANOS",
- "COADREAD",
- "MUP",
- "NECNOS",
- "UCEC",
  "NOT",
  ]
- col_indices_to_drop = [CANCER_TYPE_TO_INT_MAP[x] for x in cancer_types_to_drop]


  BATCH_SIZE = 8
@@ -53,10 +39,11 @@ NUM_WORKERS = 8


  def run(
- features, model_path, metastatic=False, batch_size=8, num_workers=8, use_cpu=False
+ features, model_path, metastatic=False, batch_size=8, num_workers=8, use_cpu=False,
+ sex=None, tissue_site_idx=None
  ):
  """Run Aeon model inference for cancer subtype prediction.
-
+
  Args:
  features: NumPy array of tile features extracted from the WSI
  model_path: Path to the pickled Aeon model file
@@ -64,7 +51,9 @@ def run(
  batch_size: Batch size for inference
  num_workers: Number of workers for data loading
  use_cpu: Force CPU usage instead of GPU
-
+ sex: Patient sex (0=Male, 1=Female), optional
+ tissue_site_idx: Tissue site index (0-56), optional
+
  Returns:
  tuple: (results_df, part_embedding)
  - results_df: DataFrame with cancer subtypes and confidence scores
@@ -79,12 +68,27 @@
  model.to(device)
  model.eval()

+ # Load the correct mapping from metadata for this model
+ metadata_path = Path(__file__).parent.parent.parent.parent / "data" / "metadata" / "target_dict.tsv"
+ with open(metadata_path) as f:
+ target_dict_str = f.read().strip().replace("'", '"')
+ target_dict = json.loads(target_dict_str)
+
+ histologies = target_dict['histologies']
+ INT_TO_CANCER_TYPE_MAP_LOCAL = {i: histology for i, histology in enumerate(histologies)}
+ CANCER_TYPE_TO_INT_MAP_LOCAL = {v: k for k, v in INT_TO_CANCER_TYPE_MAP_LOCAL.items()}
+
+ # Calculate col_indices_to_drop using local mapping
+ col_indices_to_drop_local = [CANCER_TYPE_TO_INT_MAP_LOCAL[x] for x in CANCER_TYPES_TO_DROP if x in CANCER_TYPE_TO_INT_MAP_LOCAL]
+
  site_type = SiteType.METASTASIS if metastatic else SiteType.PRIMARY

  # For UI, InferenceDataset will just be a single slide. Sample id is not relevant.
  dataset = TileFeatureTensorDataset(
  site_type=site_type,
  tile_features=features,
+ sex=sex,
+ tissue_site_idx=tissue_site_idx,
  n_max_tiles=20000,
  )
  dataloader = DataLoader(
@@ -95,15 +99,19 @@
  batch = next(iter(dataloader))
  with torch.no_grad():
  batch["tile_tensor"] = batch["tile_tensor"].to(device)
+ if "SEX" in batch:
+ batch["SEX"] = batch["SEX"].to(device)
+ if "TISSUE_SITE" in batch:
+ batch["TISSUE_SITE"] = batch["TISSUE_SITE"].to(device)
  y = model(batch)
- y["logits"][:, col_indices_to_drop] = -1e6
+ y["logits"][:, col_indices_to_drop_local] = -1e6

  batch_size = y["logits"].shape[0]
  assert batch_size == 1

  softmax = torch.nn.functional.softmax(y["logits"][0], dim=0)
  argmax = torch.argmax(softmax, dim=0)
- class_assignment = INT_TO_CANCER_TYPE_MAP[argmax.item()]
+ class_assignment = INT_TO_CANCER_TYPE_MAP_LOCAL[argmax.item()]
  max_confidence = softmax[argmax].item()
  mean_confidence = torch.mean(softmax).item()
@@ -114,7 +122,7 @@

  part_embedding = y["whole_part_representation"][0].cpu()

- for cancer_subtype, j in sorted(CANCER_TYPE_TO_INT_MAP.items()):
+ for cancer_subtype, j in sorted(CANCER_TYPE_TO_INT_MAP_LOCAL.items()):
  confidence = softmax[j].item()
  results.append((cancer_subtype, confidence))
  results.sort(key=lambda row: row[1], reverse=True)
@@ -153,6 +161,19 @@
  parser.add_argument(
  "--metastatic", action="store_true", help="Tissue is from a metastatic site"
  )
+ parser.add_argument(
+ "--sex",
+ type=str,
+ choices=["Male", "Female", "Unknown"],
+ default=None,
+ help="Patient sex (Male or Female)",
+ )
+ parser.add_argument(
+ "--tissue-site",
+ type=str,
+ default=None,
+ help="Tissue site name",
+ )
  parser.add_argument("--batch-size", type=int, default=BATCH_SIZE, help="Batch size")
  parser.add_argument(
  "--num-workers", type=int, default=NUM_WORKERS, help="Number of workers"
@@ -174,6 +195,17 @@

  features = torch.load(opt.features_path)

+ # Encode sex and tissue site if provided
+ sex_encoded = None
+ if opt.sex:
+ sex_encoded = encode_sex(opt.sex)
+ logger.info(f"Using sex: {opt.sex} (encoded as {sex_encoded})")
+
+ tissue_site_idx = None
+ if opt.tissue_site:
+ tissue_site_idx = encode_tissue_site(opt.tissue_site)
+ logger.info(f"Using tissue site: {opt.tissue_site} (encoded as {tissue_site_idx})")
+
  results_df, part_embedding = run(
  features=features,
  model_path=opt.model_path,
@@ -181,6 +213,8 @@
  batch_size=opt.batch_size,
  num_workers=opt.num_workers,
  use_cpu=opt.use_cpu,
+ sex=sex_encoded,
+ tissue_site_idx=tissue_site_idx,
  )

  results_df.to_csv(output_path, index=False)
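The `run()` hunk above parses `data/metadata/target_dict.tsv` by turning its Python-style single quotes into double quotes and feeding the result to `json.loads`, then derives both direction maps and the drop indices from it. A self-contained sketch with an inline toy string (the real file's contents are not shown in the diff); note the quote-replacement trick assumes no apostrophes inside the strings:

```python
import json

# Toy stand-in for the target_dict.tsv contents
target_dict_str = "{'histologies': ['BLCA', 'HCC', 'IHCH']}".strip().replace("'", '"')
target_dict = json.loads(target_dict_str)

histologies = target_dict["histologies"]
int_to_type = {i: h for i, h in enumerate(histologies)}
type_to_int = {v: k for k, v in int_to_type.items()}

# Mirror the guarded comprehension: only drop codes the model actually has
drop = [type_to_int[x] for x in ["CUP", "HCC"] if x in type_to_int]
```

The `if x in CANCER_TYPE_TO_INT_MAP_LOCAL` guard matters: with a model-specific mapping, some codes in the drop list may simply not exist in this model's output space.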
src/mosaic/inference/data.py CHANGED
@@ -201,6 +201,129 @@ CANCER_TYPE_TO_INT_MAP = {
  INT_TO_CANCER_TYPE_MAP = {v: k for k, v in CANCER_TYPE_TO_INT_MAP.items()}


+ # Tissue site mapping (module-level cache)
+ _TISSUE_SITE_MAP = None
+
+ # Default tissue site index for "Not Applicable"
+ DEFAULT_TISSUE_SITE_IDX = 8
+
+
+ def get_tissue_site_map():
+ """Load tissue site name → index mapping from CSV.
+
+ Returns:
+ dict: Mapping of tissue site names to indices (0-56)
+
+ Raises:
+ FileNotFoundError: If the tissue site CSV file is not found
+ """
+ global _TISSUE_SITE_MAP
+ if _TISSUE_SITE_MAP is None:
+ from pathlib import Path
+ import pandas as pd
+
+ csv_path = Path(__file__).parent.parent.parent.parent / "data" / "tissue_site_original_to_idx.csv"
+ try:
+ df = pd.read_csv(csv_path)
+ except FileNotFoundError as e:
+ raise FileNotFoundError(
+ f"Tissue site mapping file not found at {csv_path}. "
+ f"Please ensure the data directory contains 'tissue_site_original_to_idx.csv'."
+ ) from e
+
+ _TISSUE_SITE_MAP = {}
+ for _, row in df.iterrows():
+ _TISSUE_SITE_MAP[row['TISSUE_SITE']] = int(row['idx'])
+
+ return _TISSUE_SITE_MAP
+
+
+ def get_tissue_site_options():
+ """Get sorted unique tissue site names for UI dropdowns.
+
+ Returns:
+ list: Sorted list of unique tissue site names
+ """
+ site_map = get_tissue_site_map()
+ return sorted(set(site_map.keys()))
+
+
+ _SEX_MAP = None
+
+
+ def get_sex_map():
+ """Get the sex to index mapping from CSV file.
+
+ Returns:
+ dict: Mapping of sex values to indices (0-2)
+
+ Raises:
+ FileNotFoundError: If the sex mapping CSV file is not found
+ """
+ global _SEX_MAP
+ if _SEX_MAP is None:
+ from pathlib import Path
+ import pandas as pd
+
+ csv_path = Path(__file__).parent.parent.parent.parent / "data" / "sex_original_to_idx.csv"
+ try:
+ df = pd.read_csv(csv_path)
+ except FileNotFoundError as e:
+ raise FileNotFoundError(
+ f"Sex mapping file not found at {csv_path}. "
+ f"Please ensure the data directory contains 'sex_original_to_idx.csv'."
+ ) from e
+
+ _SEX_MAP = {}
+ for _, row in df.iterrows():
+ _SEX_MAP[row['SEX']] = int(row['idx'])
+
+ return _SEX_MAP
+
+
+ def encode_sex(sex):
+ """Convert sex to numeric encoding.
+
+ Args:
+ sex: "Male", "Female", or "Unknown" (case insensitive)
+
+ Returns:
+ int: 0 for Male, 1 for Female, 2 for Unknown
+ """
+ sex_map = get_sex_map()
+ unknown_idx = sex_map.get("Unknown", 2)
+ return sex_map.get(sex, unknown_idx)
+
+
+ def encode_tissue_site(site_name):
+ """Convert tissue site name to index (0-56).
+
+ Args:
+ site_name: Tissue site name from CSV
+
+ Returns:
+ int: Tissue site index, defaults to DEFAULT_TISSUE_SITE_IDX ("Not Applicable")
+ """
+ site_map = get_tissue_site_map()
+ return site_map.get(site_name, DEFAULT_TISSUE_SITE_IDX)
+
+
+ def tissue_site_to_one_hot(site_idx, num_classes=57):
+ """Convert tissue site index to one-hot vector.
+
+ Args:
+ site_idx: Index value (0-56 for tissue site, 0-2 for sex)
+ num_classes: Number of classes (57 for tissue site, 3 for sex)
+
+ Returns:
+ list: One-hot encoded vector
+ """
+ one_hot = [0] * num_classes
+ if 0 <= site_idx < num_classes:
+ one_hot[site_idx] = 1
+ return one_hot
+
+
  class SiteType(Enum):
  PRIMARY = "Primary"
  METASTASIS = "Metastasis"
@@ -211,6 +334,8 @@ class TileFeatureTensorDataset(Dataset):
  self,
  site_type: SiteType,
  tile_features: np.ndarray,
+ sex: int = None,
+ tissue_site_idx: int = None,
  n_max_tiles: int = 20000,
  ) -> None:
  """Initialize the dataset.
@@ -218,12 +343,16 @@
  Args:
  site_type: the site type as str, either "Primary" or "Metastasis"
  tile_features: the tile feature array
+ sex: patient sex (0=Male, 1=Female), optional for Aeon
+ tissue_site_idx: tissue site index (0-56), optional for Aeon
  n_max_tiles: the maximum number of tiles to use as int

  Returns:
  None
  """
  self.site_type = site_type
+ self.sex = sex
+ self.tissue_site_idx = tissue_site_idx
  self.n_max_tiles = n_max_tiles
  self.features = self._get_features(tile_features)
@@ -264,7 +393,22 @@
  Returns:
  dict: the item
  """
- return {
+ result = {
  "site": self.site_type.value,
  "tile_tensor": self.features
  }
+
+ # Add sex and tissue_site if provided (for Aeon)
+ if self.sex is not None:
+ result["SEX"] = torch.tensor(
+ tissue_site_to_one_hot(self.sex, num_classes=3),
+ dtype=torch.float32
+ )
+
+ if self.tissue_site_idx is not None:
+ result["TISSUE_SITE"] = torch.tensor(
+ tissue_site_to_one_hot(self.tissue_site_idx, num_classes=57),
+ dtype=torch.float32
+ )
+
+ return result
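The `tissue_site_to_one_hot` helper added above is reused for both inputs: sex with `num_classes=3` and tissue site with the default `num_classes=57`. A standalone re-statement of its behavior, including the silent all-zero result for out-of-range indices:

```python
def tissue_site_to_one_hot(site_idx, num_classes=57):
    # Build an all-zero vector, then set the hot position if it is in range;
    # an out-of-range index yields an all-zero vector rather than an error.
    one_hot = [0] * num_classes
    if 0 <= site_idx < num_classes:
        one_hot[site_idx] = 1
    return one_hot
```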
src/mosaic/ui/app.py CHANGED
@@ -18,7 +18,9 @@ from mosaic.ui.utils import (
  create_user_directory,
  load_settings,
  validate_settings,
+ get_tissue_sites,
  IHC_SUBTYPES,
+ SEX_OPTIONS,
  SETTINGS_COLUMNS,
  )
  from mosaic.analysis import analyze_slide
@@ -80,6 +82,8 @@ def analyze_slides(
  slides[idx],
  row["Segmentation Config"],
  row["Site Type"],
+ row["Sex"],
+ row["Tissue Site"],
  row["Cancer Subtype"],
  cancer_subtype_name_map,
  row["IHC Subtype"],
@@ -177,6 +181,16 @@ def launch_gradio(server_name, server_port, share):
  label="Site Type",
  value="Primary",
  )
+ sex_dropdown = gr.Dropdown(
+ choices=SEX_OPTIONS,
+ label="Sex",
+ value="Unknown",
+ )
+ tissue_site_dropdown = gr.Dropdown(
+ choices=get_tissue_sites(),
+ label="Tissue Site",
+ value="Unknown",
+ )
  cancer_subtype_dropdown = gr.Dropdown(
  choices=[name for name in cancer_subtype_name_map.keys()],
  label="Cancer Subtype",
@@ -195,15 +209,9 @@
  )
  with gr.Row():
  settings_input = gr.Dataframe(
- headers=[
- "Slide",
- "Site Type",
- "Cancer Subtype",
- "IHC Subtype",
- "Segmentation Config",
- ],
+ headers=SETTINGS_COLUMNS,
  label="Current Settings",
- datatype=["str", "str", "str", "str", "str"],
+ datatype=["str"] * len(SETTINGS_COLUMNS),
  visible=False,
  interactive=True,
  static_columns="Slide",
@@ -270,7 +278,7 @@
  gr.File(visible=False),
  )

- def get_settings(files, site_type, cancer_subtype, ihc_subtype, seg_config):
+ def get_settings(files, site_type, sex, tissue_site, cancer_subtype, ihc_subtype, seg_config):
  if files is None:
  return pd.DataFrame()
  settings = []
@@ -278,7 +286,7 @@
  filename = file.name if hasattr(file, "name") else file
  slide_name = filename.split("/")[-1]
  settings.append(
- [slide_name, site_type, cancer_subtype, ihc_subtype, seg_config]
+ [slide_name, site_type, sex, tissue_site, cancer_subtype, ihc_subtype, seg_config]
  )
  df = pd.DataFrame(settings, columns=SETTINGS_COLUMNS)
  return df
@@ -288,6 +296,8 @@
  [
  input_slides.change,
  site_dropdown.change,
+ sex_dropdown.change,
+ tissue_site_dropdown.change,
  cancer_subtype_dropdown.change,
  ihc_subtype_dropdown.change,
  seg_config_dropdown.change,
@@ -295,18 +305,20 @@
  inputs=[
  input_slides,
  site_dropdown,
+ sex_dropdown,
+ tissue_site_dropdown,
  cancer_subtype_dropdown,
  ihc_subtype_dropdown,
  seg_config_dropdown,
  ],
  outputs=[settings_input, settings_csv, ihc_subtype_dropdown],
  )
- def update_settings(files, site_type, cancer_subtype, ihc_subtype, seg_config):
+ def update_settings(files, site_type, sex, tissue_site, cancer_subtype, ihc_subtype, seg_config):
  has_ihc = "Breast" in cancer_subtype
  if not files:
  return None, None, gr.Dropdown(visible=has_ihc)
  settings_df = get_settings(
- files, site_type, cancer_subtype, ihc_subtype, seg_config
+ files, site_type, sex, tissue_site, cancer_subtype, ihc_subtype, seg_config
  )
  if settings_df is not None:
  has_ihc = any("Breast" in cs for cs in settings_df["Cancer Subtype"])
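After this change, `get_settings` builds one seven-element row per uploaded slide, in the same order as `SETTINGS_COLUMNS`. A plain-Python sketch of that row assembly (the real function wraps the rows in a pandas DataFrame with `columns=SETTINGS_COLUMNS`; `build_rows` is a hypothetical name for illustration):

```python
SETTINGS_COLUMNS = [
    "Slide", "Site Type", "Sex", "Tissue Site",
    "Cancer Subtype", "IHC Subtype", "Segmentation Config",
]

def build_rows(files, site_type, sex, tissue_site, cancer_subtype, ihc_subtype, seg_config):
    # One row per slide, with the slide name stripped of its directory path
    rows = []
    for filename in files:
        slide_name = filename.split("/")[-1]
        rows.append([slide_name, site_type, sex, tissue_site,
                     cancer_subtype, ihc_subtype, seg_config])
    return rows

rows = build_rows(["/tmp/a/881837.svs"], "Primary", "Male", "Bladder",
                  "Unknown", "", "Biopsy")
```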
src/mosaic/ui/utils.py CHANGED
@@ -17,16 +17,44 @@ import requests
  TEMP_USER_DATA_DIR = Path(tempfile.gettempdir()) / "mosaic_user_data"

  IHC_SUBTYPES = ["", "HR+/HER2+", "HR+/HER2-", "HR-/HER2+", "HR-/HER2-"]
+ SEX_OPTIONS = ["Unknown", "Male", "Female"]

  SETTINGS_COLUMNS = [
  "Slide",
  "Site Type",
+ "Sex",
+ "Tissue Site",
  "Cancer Subtype",
  "IHC Subtype",
  "Segmentation Config",
  ]

  oncotree_code_map = {}
+ tissue_site_list = None
+
+
+ def get_tissue_sites():
+ """Get the list of tissue sites from the tissue site map file.
+
+ Returns:
+ List of tissue site names. Returns ["Unknown"] if the CSV file is not found.
+ """
+ global tissue_site_list
+ if tissue_site_list is None:
+ try:
+ current_dir = Path(__file__).parent.parent.parent.parent
+ tissue_site_map_path = current_dir / "data" / "tissue_site_original_to_idx.csv"
+ df = pd.read_csv(tissue_site_map_path)
+ # Get unique tissue sites and sort them
+ tissue_site_list = ["Unknown"] + sorted(df["TISSUE_SITE"].unique().tolist())
+ except FileNotFoundError:
+ gr.Warning(
+ f"Tissue site mapping file not found at {tissue_site_map_path}. "
+ "Only 'Unknown' option will be available for tissue site selection. "
+ "Please ensure the data files are downloaded from the model repository."
+ )
+ tissue_site_list = ["Unknown"]
+ return tissue_site_list


  def get_oncotree_code_name(code):
@@ -98,6 +126,10 @@ def load_settings(slide_csv_path):
  settings_df["Cancer Subtype"] = "Unknown"
  if "IHC Subtype" not in settings_df.columns:
  settings_df["IHC Subtype"] = ""
+ if "Sex" not in settings_df.columns:
+ settings_df["Sex"] = "Unknown"
+ if "Tissue Site" not in settings_df.columns:
+ settings_df["Tissue Site"] = "Unknown"
  if not set(SETTINGS_COLUMNS).issubset(settings_df.columns):
  raise ValueError("Missing required column in CSV file")
  settings_df = settings_df[SETTINGS_COLUMNS]
@@ -125,6 +157,8 @@ def validate_settings(settings_df, cancer_subtype_name_map, cancer_subtypes, rev
  """
  settings_df.columns = SETTINGS_COLUMNS
  warnings = []
+ tissue_sites = get_tissue_sites()
+
  for idx, row in settings_df.iterrows():
  slide_name = row["Slide"]
  subtype = row["Cancer Subtype"]
@@ -142,6 +176,16 @@
  f"Slide {slide_name}: Unknown site type. Valid types are: Metastatic, Primary. "
  )
  settings_df.at[idx, "Site Type"] = "Primary"
+ if row["Sex"] not in SEX_OPTIONS:
+ warnings.append(
+ f"Slide {slide_name}: Unknown sex. Valid options are: {', '.join(SEX_OPTIONS)}. "
+ )
+ settings_df.at[idx, "Sex"] = "Unknown"
+ if row["Tissue Site"] not in tissue_sites:
+ warnings.append(
+ f"Slide {slide_name}: Unknown tissue site. Valid tissue sites are: {', '.join(tissue_sites)}. "
+ )
+ settings_df.at[idx, "Tissue Site"] = "Unknown"
  if (
  "Breast" not in settings_df.at[idx, "Cancer Subtype"]
  and row["IHC Subtype"] != ""
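The validation hunk above resets any unrecognized Sex or Tissue Site value to "Unknown" and records a warning instead of failing the upload. A sketch of that per-row logic using plain dicts in place of the DataFrame rows:

```python
SEX_OPTIONS = ["Unknown", "Male", "Female"]

def validate_row(row, tissue_sites, warnings):
    # Mirror the fallback behavior: warn, then coerce to "Unknown"
    if row["Sex"] not in SEX_OPTIONS:
        warnings.append(f"Slide {row['Slide']}: Unknown sex.")
        row["Sex"] = "Unknown"
    if row["Tissue Site"] not in tissue_sites:
        warnings.append(f"Slide {row['Slide']}: Unknown tissue site.")
        row["Tissue Site"] = "Unknown"
    return row

warnings = []
row = validate_row(
    {"Slide": "s1.svs", "Sex": "M", "Tissue Site": "Mars"},
    ["Unknown", "Bladder", "Liver"],
    warnings,
)
```

Coercing rather than raising keeps batch CSV uploads usable when a single cell is malformed.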
test_slides/AEON_TEST_SUMMARY.md ADDED
@@ -0,0 +1,178 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Aeon Model Test Summary
2
+
3
+ ## Overview
4
+
5
+ This document summarizes the Aeon cancer subtype prediction model testing performed on January 7, 2026.
6
+
7
+ ## Model Information
8
+
9
+ - **Model File**: `aeon_model.pkl` (118MB)
+ - **Source**: Exported from `checkpoint.ckpt` (469MB, Nov 29, 2024)
+ - **Architecture**: AeonLateAggregator with late fusion
+ - **Output Classes**: 160 cancer subtypes
+ - **Input Features**:
+   - Tile embeddings from Optimus model
+   - Sex (one-hot encoded, 3 classes)
+   - Tissue site (one-hot encoded, 57 classes)
+   - Site type (Primary/Metastatic)
+
+ ## Test Slides
+
+ Three test slides were processed using the full Mosaic pipeline with Aeon inference:
+
+ ### Slide 1: 881837
+ - **File**: `881837.svs`
+ - **Ground Truth**: BLCA (Bladder Urothelial Carcinoma)
+ - **Site Type**: Primary
+ - **Sex**: Male
+ - **Tissue Site**: Bladder
+ - **Prediction**: BLCA
+ - **Confidence**: 98.19%
+ - **Status**: ✓ PASS
+
+ ### Slide 2: 744547
+ - **File**: `744547.svs`
+ - **Ground Truth**: HCC (Hepatocellular Carcinoma)
+ - **Site Type**: Metastatic
+ - **Sex**: Male
+ - **Tissue Site**: Liver
+ - **Prediction**: HCC
+ - **Confidence**: 99.49%
+ - **Status**: ✓ PASS
+
+ ### Slide 3: 755246
+ - **File**: `755246.svs`
+ - **Ground Truth**: HCC (Hepatocellular Carcinoma)
+ - **Site Type**: Primary
+ - **Sex**: Male
+ - **Tissue Site**: Liver
+ - **Prediction**: HCC
+ - **Confidence**: 99.61%
+ - **Status**: ✓ PASS
+
+ ## Test Results
+
+ | Slide ID | Ground Truth | Prediction | Confidence | Next Highest | Status |
+ |----------|--------------|------------|------------|--------------|--------|
+ | 881837 | BLCA | BLCA | 98.19% | UTUC (0.87%) | ✓ PASS |
+ | 744547 | HCC | HCC | 99.49% | IHCH (0.18%) | ✓ PASS |
+ | 755246 | HCC | HCC | 99.61% | IHCH (0.29%) | ✓ PASS |
+
+ **Overall Accuracy**: 3/3 (100%)
+
+ ## Pipeline Configuration
+
+ ### Segmentation
+ - **Config**: Biopsy (`SegmentationConfig.BIOPSY`)
+ - **Tissue Detection**: Automated segmentation of tissue regions
+ - **Tile Size**: 224x224 pixels at 20x magnification
+
+ ### Feature Extraction
+ - **CTransPath**: Pretrained histopathology foundation model
+ - **Optimus**: Multi-task feature aggregator
+ - **Marker Classifier**: Tissue marker prediction
+
+ ### Aeon Inference
+ - **Model Path**: `data/aeon_model.pkl`
+ - **Batch Size**: 8
+ - **Workers**: 4
+ - **Sex Encoding**: Male=0 (one-hot: [1,0,0])
+ - **Tissue Site Encoding**:
+   - Bladder=11 (one-hot vector, 57 dims)
+   - Liver=26 (one-hot vector, 57 dims)
+
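The one-hot scheme quoted above can be sketched in a few lines. This is an illustration using the indices stated in this document; the actual encoding lives inside `mosaic.inference`, so the `one_hot` helper and variable names here are hypothetical:

```python
def one_hot(index: int, num_classes: int) -> list[int]:
    """Return a one-hot list with a single 1 at `index`."""
    vec = [0] * num_classes
    vec[index] = 1
    return vec

# Indices from this README: Male=0 (3 sex classes); Bladder=11, Liver=26 (57 tissue sites).
sex_male = one_hot(0, 3)    # -> [1, 0, 0]
bladder = one_hot(11, 57)   # 57-dim vector with a 1 at position 11
liver = one_hot(26, 57)     # 57-dim vector with a 1 at position 26
```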
84
+ ## Key Implementation Details
+
+ ### Cancer Type Mapping
+ - Mappings loaded from `data/metadata/target_dict.tsv`
+ - 160 histologies supported
+ - 5 cancer types excluded from predictions: UDMN, ADNOS, CUP, CUPNOS, NOT
+
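Loading such a mapping needs only the standard library. The exact schema of `target_dict.tsv` is not shown in this PR, so the two-column `name<TAB>index` layout in this sketch is an assumption:

```python
import csv
import io

# Hypothetical excerpt of target_dict.tsv (the real file has 160 rows).
tsv_text = "HCC\t0\nBLCA\t1\nIHCH\t2\n"

cancer_type_to_int = {
    name: int(idx)
    for name, idx in csv.reader(io.StringIO(tsv_text), delimiter="\t")
}
# Inverse map for decoding model outputs back to subtype codes.
int_to_cancer_type = {v: k for k, v in cancer_type_to_int.items()}
```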
91
+ ### Model Architecture
+ ```python
+ AeonLateAggregator(
+     tile_emb_dim=768,
+     num_targets=160,
+     sex_embedding_dim=mini_latent_dim,  # mini_latent_dim = latent_dim // 4
+     tissue_site_embedding_dim=mini_latent_dim,
+     site_embedding_dim=mini_latent_dim
+ )
+ ```
+
+ ### Encoding Functions
+ - **Sex**: `encode_sex(sex_str)` → index (0-2) → one-hot (3 classes)
+ - **Tissue Site**: `encode_tissue_site(tissue_site_str)` → index (0-56) → one-hot (57 classes)
+
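A rough sketch of how `encode_sex` might be driven by `sex_original_to_idx.csv`. The CSV's column names (`original`, `idx`) and the inline sample rows are assumptions, not the file's actual contents; only Male=0 is stated elsewhere in this document:

```python
import csv
import io

# Assumed layout of sex_original_to_idx.csv; Male=0 matches the encoding above.
sex_csv = "original,idx\nMale,0\nFemale,1\nUnknown,2\n"
SEX_TO_IDX = {
    row["original"]: int(row["idx"])
    for row in csv.DictReader(io.StringIO(sex_csv))
}

def encode_sex(sex_str: str) -> list[int]:
    """Look up the index for `sex_str`, then one-hot over 3 classes."""
    vec = [0] * 3
    vec[SEX_TO_IDX[sex_str]] = 1
    return vec
```

`encode_tissue_site` would follow the same pattern, reading `tissue_site_original_to_idx.csv` and one-hot encoding over 57 classes.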
106
+ ## Critical Fixes Applied
+
+ ### Issue 1: Model-Metadata Mismatch
+ - **Problem**: The model outputs 160 classes, but the code used a 183-entry mapping
+ - **Solution**: Load mappings from `metadata/target_dict.tsv` instead of global constants
+ - **Files Modified**: `src/mosaic/inference/aeon.py` (lines 87-102, 127, 130-147)
+
+ ### Issue 2: Checkpoint Format
+ - **Problem**: The inference code expects `.pkl` files, not PyTorch Lightning `.ckpt` files
+ - **Solution**: Exported the checkpoint using paladin's `AeonLightningModule.load_from_checkpoint()`
+ - **Export Command**: See `scripts/export_aeon_checkpoint.py`
+
+ ### Issue 3: Missing AeonLateAggregator
+ - **Problem**: The PyPI paladin package had `AeonAggregator`, not `AeonLateAggregator`
+ - **Solution**: Installed paladin from the git repo's dev branch
+ - **Command**: `uv sync --upgrade-package paladin`
+
+ ## Dependencies
+
+ ### Critical Packages
+ - `paladin` (from git: ssh://git@github.com/pathology-data-mining/paladin.git@dev)
+ - `torch>=2.0`
+ - `pytorch-lightning`
+ - `pandas`
+ - `numpy`
+
+ ### Model Files Required
+ - `aeon_model.pkl` (118MB)
+ - `metadata/target_dict.tsv`
+ - `metadata/n_classes.txt`
+ - `metadata/ontology_embedding_dim.txt`
+ - `sex_original_to_idx.csv`
+ - `tissue_site_original_to_idx.csv`
+
+ ## Reproducibility
+
+ All test results are fully reproducible using:
+ 1. The test samples defined in `test_samples.json`
+ 2. The run script: `scripts/run_aeon_tests.sh`
+ 3. The model and metadata uploaded to `PDM-Group/paladin-aeon-models` on Hugging Face
+
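A test runner only needs to iterate `test_samples.json` and compare predictions against each record's `cancer_type`. The sketch below inlines one entry from that file; the pipeline call itself is omitted:

```python
import json

# One entry copied from test_slides/test_samples.json.
samples = json.loads("""[
  {"sample_id": "P-0000034-T01-IM3", "image_id": "881837",
   "sex": "MALE", "tissue_site": "Bladder", "site_type": "Primary",
   "cancer_type": "BLCA", "confidence": 0.9412469863891602}
]""")

for sample in samples:
    # Each record carries everything needed to parameterize one pipeline run.
    required = {"image_id", "sex", "tissue_site", "site_type", "cancer_type"}
    assert required <= sample.keys()
```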
147
+ ## Output Files
+
+ For each slide, the following files are generated in `test_slides/results/{slide_id}/`:
+ - `{slide_id}_aeon_results.csv` - Full confidence scores for all 160 cancer subtypes
+ - `{slide_id}_paladin_results.csv` - Biomarker predictions
+ - `{slide_id}_mask.png` - Tissue segmentation mask
+ - `{slide_id}_features.h5` - Extracted tile features
+
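Each `*_aeon_results.csv` is written sorted by confidence, but the top call can also be recovered directly with the standard library. A sketch using the first rows of `744547_aeon_results.csv` from this PR:

```python
import csv
import io

# First rows of 744547_aeon_results.csv (truncated).
results_csv = (
    "Cancer Subtype,Confidence\n"
    "HCC,0.9949353337287903\n"
    "IHCH,0.0018046980258077383\n"
)
rows = list(csv.DictReader(io.StringIO(results_csv)))
# Pick the subtype with the highest confidence score.
top = max(rows, key=lambda r: float(r["Confidence"]))
```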
155
+ ## Validation Metrics
+
+ - **Prediction Accuracy**: 100% (3/3)
+ - **Average Confidence**: 99.10%
+ - **Minimum Confidence**: 98.19%
+ - **Maximum Confidence**: 99.61%
+
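These summary numbers follow directly from the three per-slide confidences:

```python
# Per-slide top-1 confidences: 881837, 744547, 755246.
confidences = [0.9819, 0.9949, 0.9961]

average = sum(confidences) / len(confidences)  # ~0.9910 (99.10%)
minimum = min(confidences)                     # 0.9819 (98.19%)
maximum = max(confidences)                     # 0.9961 (99.61%)
```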
162
+ ## Hugging Face Repository
+
+ All model files and metadata have been uploaded to:
+ - **Repository**: `PDM-Group/paladin-aeon-models`
+ - **URL**: https://huggingface.co/PDM-Group/paladin-aeon-models
+
+ ### Uploaded Files (Jan 7, 2026)
+ - `aeon_model.pkl` (118MB)
+ - `metadata/` (5 files)
+ - `sex_original_to_idx.csv`
+ - `tissue_site_original_to_idx.csv`
+
+ ## Test Date
+
+ - **Date**: January 7, 2026
+ - **Git Commit**: 49fbf68 (Complete implementation of sex and tissue site parameters)
+ - **Tester**: Ray Lim
test_slides/results/744547/744547_aeon_results.csv ADDED
@@ -0,0 +1,161 @@
1
+ Cancer Subtype,Confidence
2
+ HCC,0.9949353337287903
3
+ IHCH,0.0018046980258077383
4
+ ACC,0.000752770050894469
5
+ OPHSC,0.00036126983468420804
6
+ PAAC,0.0003446016926318407
7
+ THHC,0.0001074946703738533
8
+ NPC,9.898660937324166e-05
9
+ HNSC,8.415944466833025e-05
10
+ STAD,8.021725079743192e-05
11
+ HGNEC,7.314849062822759e-05
12
+ WT,6.0233636759221554e-05
13
+ ANSC,5.0760227168211713e-05
14
+ NSGCT,4.1861039790092036e-05
15
+ CHRCC,3.653292151284404e-05
16
+ EHCH,3.409649070817977e-05
17
+ PEMESO,3.360131086083129e-05
18
+ SFT,3.2356434530811384e-05
19
+ PANET,3.2246403861790895e-05
20
+ ANGS,3.0679162591695786e-05
21
+ PAAD,2.974703056679573e-05
22
+ EGC,2.666616092028562e-05
23
+ UM,2.5927740352926776e-05
24
+ MCC,2.5807627025642432e-05
25
+ SKCM,2.2870222892379388e-05
26
+ ODG,2.1314301193342544e-05
27
+ COAD,2.0228028006386012e-05
28
+ SCLC,2.006701470236294e-05
29
+ PTAD,1.9878523744409904e-05
30
+ SSRCC,1.9336708646733314e-05
31
+ PECOMA,1.8762702893582173e-05
32
+ SBOV,1.75385684997309e-05
33
+ PRCC,1.7517071682959795e-05
34
+ EMPD,1.742972199281212e-05
35
+ LMS,1.7376194591633976e-05
36
+ ES,1.7088817912735976e-05
37
+ SBWDNET,1.6885298464330845e-05
38
+ READ,1.6691648852429353e-05
39
+ ASTR,1.5237395928124897e-05
40
+ MACR,1.468232130719116e-05
41
+ ARMM,1.4290351828094572e-05
42
+ CSCC,1.4241238204704132e-05
43
+ LUSC,1.3981271877128165e-05
44
+ THPA,1.3920045603299513e-05
45
+ CCOV,1.389113640470896e-05
46
+ PRAD,1.3490677702066023e-05
47
+ BLCA,1.2866928955190815e-05
48
+ GIST,1.2835040251957253e-05
49
+ BCC,1.2371066986816004e-05
50
+ THPD,1.229105419042753e-05
51
+ BMGCT,1.1911726687685587e-05
52
+ DES,1.1774703125411179e-05
53
+ LGSOC,1.158510258392198e-05
54
+ UTUC,1.1252666809014045e-05
55
+ PAMPCA,1.1209620424779132e-05
56
+ ACYC,1.0676962119759992e-05
57
+ THYC,1.063129002432106e-05
58
+ ULMS,1.0360988198954146e-05
59
+ WDLS,9.846420653047971e-06
60
+ DA,9.834530828811694e-06
61
+ MPNST,9.739153028931469e-06
62
+ HNMUCM,9.407535799255129e-06
63
+ THAP,9.405741366208531e-06
64
+ OCS,9.383109500049613e-06
65
+ GBAD,9.255892109649722e-06
66
+ GCCAP,9.183406291413121e-06
67
+ SDCA,9.130208127317019e-06
68
+ EPIS,8.989960406324826e-06
69
+ PHC,8.885029274097178e-06
70
+ EHAE,8.772132787271403e-06
71
+ PLMESO,8.688036359671969e-06
72
+ ESCC,8.65026049723383e-06
73
+ TAC,8.619595064374153e-06
74
+ GRCT,8.587684533267748e-06
75
+ BLAD,8.564702511648647e-06
76
+ DSRCT,8.161241566995159e-06
77
+ EPM,7.880689736339264e-06
78
+ MFH,7.607672614540206e-06
79
+ SCBC,7.4185886660416145e-06
80
+ SEM,7.19011586625129e-06
81
+ SYNS,7.085985998855904e-06
82
+ UCP,6.969770311116008e-06
83
+ UEC,6.914885034348117e-06
84
+ LUCA,6.88146519678412e-06
85
+ GEJ,6.397517154255183e-06
86
+ ALUCA,6.210095307324082e-06
87
+ CHDM,6.182817287481157e-06
88
+ OS,6.126435437181499e-06
89
+ MAAP,6.075235432945192e-06
90
+ LUPC,6.040307198418304e-06
91
+ ESCA,5.798766324005555e-06
92
+ ERMS,5.519645128515549e-06
93
+ RBL,5.355142548069125e-06
94
+ VSC,5.320628133631544e-06
95
+ DDLS,5.237710411165608e-06
96
+ CCRCC,4.890111540589714e-06
97
+ ARMS,4.8574038373772055e-06
98
+ MNG,4.79665777675109e-06
99
+ HGSOC,4.588881893141661e-06
100
+ THYM,4.543203431239817e-06
101
+ BA,4.302230991015676e-06
102
+ NBL,4.251941845723195e-06
103
+ UCCC,4.113941486139083e-06
104
+ GBM,3.829367415164597e-06
105
+ EOV,3.7675256407965207e-06
106
+ CHS,3.7070362850499805e-06
107
+ IDC,2.831375240930356e-06
108
+ MBC,2.820524969138205e-06
109
+ DASTR,2.7697560653905384e-06
110
+ UCS,2.7230109935771907e-06
111
+ CESC,2.5985430056607584e-06
112
+ VMM,2.466509386067628e-06
113
+ ILC,2.451496357025462e-06
114
+ LUNE,2.4507951366103953e-06
115
+ ATM,2.3741329187032534e-06
116
+ MRLS,2.351298462599516e-06
117
+ THME,2.2384033400157932e-06
118
+ MOV,2.2177562186698196e-06
119
+ ECAD,2.15100362765952e-06
120
+ LUAD,1.7751663108356297e-06
121
+ ACRM,1.5891558859948418e-06
122
+ MFS,1.58587818077649e-06
123
+ PAST,1.1561178325791843e-06
124
+ USC,9.618892136131763e-07
125
+ SCHW,9.386900501340278e-07
126
+ NECNOS,6.424317007258651e-07
127
+ BRCANOS,4.31851503890357e-07
128
+ MDLC,4.30085862035412e-07
129
+ SCCNOS,3.5081365012956667e-07
130
+ SBC,3.1285472346098686e-07
131
+ NSCLC,2.687320375116542e-07
132
+ MXOV,2.395507863184321e-07
133
+ SARCNOS,2.364027835710658e-07
134
+ MEL,2.3354837708211562e-07
135
+ CHOL,2.3344369992628344e-07
136
+ PAASC,2.256272324530073e-07
137
+ MUP,2.2071883165608597e-07
138
+ BRCA,2.1710592079671187e-07
139
+ NVRINT,2.1676953565474832e-07
140
+ AMPCA,2.1609788802834373e-07
141
+ LUAS,2.136845722588987e-07
142
+ URCC,2.0894131580462272e-07
143
+ BRCNOS,2.067983615461344e-07
144
+ GINET,2.0037792580751557e-07
145
+ PDC,1.7432496690616972e-07
146
+ GNOS,1.6309185468799114e-07
147
+ NETNOS,1.6066735497588525e-07
148
+ APAD,1.5840650746667961e-07
149
+ DIFG,1.5840393530197616e-07
150
+ COADREAD,1.5738932290787488e-07
151
+ CSCLC,1.4587288887923933e-07
152
+ RCC,1.3664508458077762e-07
153
+ UMEC,1.1690717371948267e-07
154
+ GBC,8.854171795746879e-08
155
+ NSCLCPD,7.288439718422524e-08
156
+ UCEC,7.106321220362588e-08
157
+ ADNOS,0.0
158
+ CUP,0.0
159
+ CUPNOS,0.0
160
+ NOT,0.0
161
+ UDMN,0.0
test_slides/results/744547/744547_paladin_results.csv ADDED
@@ -0,0 +1,2 @@
+ Cancer Subtype,Biomarker,Score
+ HCC,Del_8p,0.2197096198797226
test_slides/results/755246/755246_aeon_results.csv ADDED
@@ -0,0 +1,161 @@
1
+ Cancer Subtype,Confidence
2
+ HCC,0.9960950016975403
3
+ IHCH,0.0028622548561543226
4
+ PAAC,0.0002748313418123871
5
+ OPHSC,4.615329817170277e-05
6
+ ACC,3.686468699015677e-05
7
+ NPC,3.5695637052413076e-05
8
+ HGNEC,2.805266558425501e-05
9
+ PAMPCA,2.4401044356636703e-05
10
+ HNSC,2.1414962247945368e-05
11
+ WT,2.062890780507587e-05
12
+ STAD,1.9500384951243177e-05
13
+ ANSC,1.9137134586344473e-05
14
+ NSGCT,1.8425340385874733e-05
15
+ ACYC,1.6153164324350655e-05
16
+ SFT,1.4236396054911893e-05
17
+ PAAD,1.385363702866016e-05
18
+ ANGS,1.3724854397878516e-05
19
+ CSCC,1.1247223483223934e-05
20
+ CHRCC,1.1185951734660193e-05
21
+ MCC,1.1107679711130913e-05
22
+ PEMESO,1.0368969924456906e-05
23
+ LUSC,1.0365045454818755e-05
24
+ BMGCT,1.0343398571421858e-05
25
+ LGSOC,1.0238399227091577e-05
26
+ COAD,1.0101895895786583e-05
27
+ PECOMA,9.616783245292027e-06
28
+ ODG,9.323048288933933e-06
29
+ READ,8.963126674643718e-06
30
+ GBAD,8.933258868637495e-06
31
+ ASTR,8.659791092213709e-06
32
+ EMPD,8.184463695215527e-06
33
+ CCOV,7.98109067545738e-06
34
+ SSRCC,7.932881999295205e-06
35
+ LMS,7.707203621976078e-06
36
+ SDCA,7.481134616682539e-06
37
+ EGC,7.304894097615033e-06
38
+ EHCH,7.18998262527748e-06
39
+ DA,7.112325420166599e-06
40
+ GRCT,7.074557743180776e-06
41
+ THYC,6.8966392063885e-06
42
+ ES,6.499152732430957e-06
43
+ TAC,6.04554270466906e-06
44
+ LUPC,5.894301011721836e-06
45
+ UM,5.811031769553665e-06
46
+ HNMUCM,5.70150587009266e-06
47
+ MPNST,5.589005922956858e-06
48
+ PTAD,5.411789061326999e-06
49
+ THHC,5.279471679386916e-06
50
+ SBOV,5.134588718647137e-06
51
+ EHAE,5.056334430264542e-06
52
+ UCP,4.79583195556188e-06
53
+ PANET,4.769396582560148e-06
54
+ UEC,4.691957201430341e-06
55
+ MFH,4.5436950131261256e-06
56
+ DES,4.397928478283575e-06
57
+ MACR,4.386647560750134e-06
58
+ DSRCT,4.259570687281666e-06
59
+ BCC,4.2207666410831735e-06
60
+ THYM,4.200107468932401e-06
61
+ SKCM,4.163767698628362e-06
62
+ SCLC,4.137152245675679e-06
63
+ SBWDNET,4.107126642338699e-06
64
+ BLCA,4.042853561259108e-06
65
+ THPA,3.967577868024819e-06
66
+ CHDM,3.964094503317028e-06
67
+ EPIS,3.7835466173419263e-06
68
+ UTUC,3.6972760426579043e-06
69
+ PRCC,3.5816708532365737e-06
70
+ RBL,3.18466845783405e-06
71
+ PHC,3.1321446840593126e-06
72
+ ARMS,3.073702600886463e-06
73
+ WDLS,2.967372893181164e-06
74
+ HGSOC,2.9246689337014686e-06
75
+ EOV,2.842233243427472e-06
76
+ PLMESO,2.8313088478171267e-06
77
+ LUCA,2.7812454845843604e-06
78
+ ERMS,2.6738305223261705e-06
79
+ ARMM,2.653551291587064e-06
80
+ ESCC,2.534464101699996e-06
81
+ GIST,2.4675975964782992e-06
82
+ OS,2.3962300019775284e-06
83
+ SEM,2.267454874527175e-06
84
+ UCCC,2.2549797904503066e-06
85
+ GEJ,2.228342964372132e-06
86
+ ALUCA,2.21250593313016e-06
87
+ CCRCC,2.1547989490500186e-06
88
+ GCCAP,2.145640792150516e-06
89
+ CHS,2.1370740341808414e-06
90
+ EPM,2.113296659445041e-06
91
+ LUNE,1.8964744867844274e-06
92
+ THAP,1.8861140915760188e-06
93
+ MNG,1.883976551653177e-06
94
+ SCBC,1.8645082491275389e-06
95
+ GBM,1.8083156874126871e-06
96
+ THME,1.7755396584107075e-06
97
+ SYNS,1.749014700180851e-06
98
+ VSC,1.7275091295232414e-06
99
+ PRAD,1.7234281131095486e-06
100
+ THPD,1.5961649069140549e-06
101
+ OCS,1.574517796143482e-06
102
+ ULMS,1.5738452248115209e-06
103
+ ESCA,1.5239124877552968e-06
104
+ LUAD,1.5225077731884085e-06
105
+ ATM,1.4334439129015664e-06
106
+ DASTR,1.3374641412156052e-06
107
+ BLAD,1.3204950164436013e-06
108
+ MRLS,1.2621910627785837e-06
109
+ PAST,1.214046847053396e-06
110
+ DDLS,1.207565446748049e-06
111
+ BA,1.193913703900762e-06
112
+ USC,1.1651327440631576e-06
113
+ UCS,1.127504333453544e-06
114
+ IDC,1.1131320434287773e-06
115
+ MBC,1.073736029866268e-06
116
+ CESC,1.0562247325651697e-06
117
+ ECAD,1.0221086768069654e-06
118
+ MOV,1.0173221198783722e-06
119
+ MFS,8.068860779530951e-07
120
+ ILC,7.115056064321834e-07
121
+ NBL,7.080777209012012e-07
122
+ MAAP,7.038765375000366e-07
123
+ SCHW,6.599229323001055e-07
124
+ VMM,6.425576088986418e-07
125
+ ACRM,5.137104039931728e-07
126
+ SCCNOS,2.88556179839361e-07
127
+ NECNOS,2.860723498088191e-07
128
+ MDLC,1.5423735533204308e-07
129
+ SBC,1.473414386055083e-07
130
+ LUAS,1.313886741627357e-07
131
+ AMPCA,1.1514853781591228e-07
132
+ BRCANOS,1.100927349284575e-07
133
+ COADREAD,1.0235498848487623e-07
134
+ BRCNOS,1.0230521496623624e-07
135
+ NSCLC,9.84467831699476e-08
136
+ URCC,9.794707978016959e-08
137
+ SARCNOS,9.708332271429754e-08
138
+ MEL,9.660030997338254e-08
139
+ MXOV,9.167136028054301e-08
140
+ BRCA,9.164460124111429e-08
141
+ GINET,9.150922863909727e-08
142
+ MUP,8.768505921352698e-08
143
+ GNOS,8.663077721848822e-08
144
+ CSCLC,8.40352782915943e-08
145
+ PAASC,8.390650663159249e-08
146
+ CHOL,8.387851124780354e-08
147
+ NVRINT,7.996587214620376e-08
148
+ PDC,7.627340892213397e-08
149
+ APAD,7.168080173869384e-08
150
+ NETNOS,5.965442539945798e-08
151
+ RCC,5.8925582635538376e-08
152
+ DIFG,5.8173132089223145e-08
153
+ UMEC,4.262162889290266e-08
154
+ GBC,3.4216750322002554e-08
155
+ NSCLCPD,2.902431717188847e-08
156
+ UCEC,2.4343131954651653e-08
157
+ ADNOS,0.0
158
+ CUP,0.0
159
+ CUPNOS,0.0
160
+ NOT,0.0
161
+ UDMN,0.0
test_slides/results/755246/755246_paladin_results.csv ADDED
@@ -0,0 +1,2 @@
+ Cancer Subtype,Biomarker,Score
+ HCC,Del_8p,0.5229867100715637
test_slides/results/881837/881837_aeon_results.csv ADDED
@@ -0,0 +1,161 @@
1
+ Cancer Subtype,Confidence
2
+ BLCA,0.9819191694259644
3
+ UTUC,0.008712546899914742
4
+ ESCC,0.0018056846456602216
5
+ HNSC,0.0012019714340567589
6
+ SCBC,0.0011546163586899638
7
+ CSCC,0.0009968428639695048
8
+ BLAD,0.000923480314668268
9
+ VSC,0.0003943268384318799
10
+ LUSC,0.0002285304362885654
11
+ MFH,0.0002151085063815117
12
+ CESC,0.00016243607387878
13
+ ANSC,0.00014434086915571243
14
+ OPHSC,0.00010440379264764488
15
+ ERMS,9.818207036005333e-05
16
+ ESCA,8.05684321676381e-05
17
+ THME,7.75650842115283e-05
18
+ DDLS,7.418631867039949e-05
19
+ EMPD,6.874153041280806e-05
20
+ SKCM,6.219839997356758e-05
21
+ STAD,5.7987392210634425e-05
22
+ ARMM,5.323389996192418e-05
23
+ VMM,4.82685245515313e-05
24
+ ANGS,4.764321420225315e-05
25
+ HNMUCM,4.300122964195907e-05
26
+ MPNST,4.062237712787464e-05
27
+ THAP,3.992616620962508e-05
28
+ CCOV,3.978966560680419e-05
29
+ EGC,3.804164953180589e-05
30
+ PECOMA,3.58297438651789e-05
31
+ GEJ,3.361854032846168e-05
32
+ LMS,3.190735151292756e-05
33
+ GIST,3.053894397453405e-05
34
+ UCP,2.6960187824442983e-05
35
+ BCC,2.6275198251823895e-05
36
+ WDLS,2.504539406800177e-05
37
+ COAD,2.4500428480678238e-05
38
+ READ,2.3131282432586886e-05
39
+ PLMESO,2.2688231183565222e-05
40
+ PHC,2.2610343876294792e-05
41
+ HGSOC,2.2057527530705556e-05
42
+ EHCH,2.0888484868919477e-05
43
+ SFT,2.061663690255955e-05
44
+ LUPC,2.0348075850051828e-05
45
+ ARMS,2.0138555555604398e-05
46
+ NPC,2.0069532183697447e-05
47
+ SDCA,1.9388126020203345e-05
48
+ PRAD,1.885020174086094e-05
49
+ PAMPCA,1.8754808479570784e-05
50
+ HGNEC,1.862645149230957e-05
51
+ DA,1.7162436051876284e-05
52
+ EPM,1.6749159840401262e-05
53
+ PTAD,1.6459982361993752e-05
54
+ NSGCT,1.5192160390142817e-05
55
+ UM,1.4609363461204339e-05
56
+ SSRCC,1.4532034583680797e-05
57
+ SEM,1.3767842574452516e-05
58
+ GCCAP,1.3716285138798412e-05
59
+ OS,1.3539702194975689e-05
60
+ LUCA,1.3414683053269982e-05
61
+ BA,1.340845392405754e-05
62
+ ODG,1.3301939361554105e-05
63
+ ULMS,1.3188847333367448e-05
64
+ ILC,1.3158909496269189e-05
65
+ GBAD,1.3059830052952748e-05
66
+ HCC,1.2799720934708603e-05
67
+ THYC,1.2515503840404563e-05
68
+ ACC,1.2373102435958572e-05
69
+ EHAE,1.2079371117579285e-05
70
+ THPD,1.2004609743598849e-05
71
+ LUAD,1.1745101801352575e-05
72
+ IHCH,1.1457958862592932e-05
73
+ BMGCT,1.14437843876658e-05
74
+ DSRCT,1.1421912859077565e-05
75
+ ASTR,1.1173871826031245e-05
76
+ THPA,1.0899780136242043e-05
77
+ DES,1.0811750144057442e-05
78
+ MNG,1.0468449545442127e-05
79
+ PEMESO,1.0408997695776634e-05
80
+ ES,1.0323257811251096e-05
81
+ SCHW,1.024130961013725e-05
82
+ ACRM,9.928556210070383e-06
83
+ RBL,9.5573650469305e-06
84
+ UCS,9.554930329613853e-06
85
+ CHDM,9.308922926720697e-06
86
+ NBL,9.223444067174569e-06
87
+ OCS,9.14371048565954e-06
88
+ MBC,8.864662049745675e-06
89
+ EPIS,8.839479050948285e-06
90
+ LUNE,7.987197932379786e-06
91
+ MFS,7.876495146774687e-06
92
+ GRCT,7.63608932174975e-06
93
+ UCCC,7.5786451816384215e-06
94
+ SBOV,7.530676612077514e-06
95
+ LGSOC,7.442437436111504e-06
96
+ IDC,7.195951184257865e-06
97
+ ATM,7.1556032708031125e-06
98
+ CCRCC,6.893693353049457e-06
99
+ PAAC,6.823243893450126e-06
100
+ TAC,6.447641681006644e-06
101
+ PRCC,6.3316215346276294e-06
102
+ THHC,6.0560396377695724e-06
103
+ ACYC,5.6987450989254285e-06
104
+ SYNS,5.622236585622886e-06
105
+ EOV,5.584717200690648e-06
106
+ PAAD,5.3922740335110575e-06
107
+ ALUCA,5.391060767578892e-06
108
+ SBWDNET,5.067640813649632e-06
109
+ CHRCC,5.057868747826433e-06
110
+ MOV,5.055631390860071e-06
111
+ MACR,4.9109949031844735e-06
112
+ USC,4.86157659906894e-06
113
+ THYM,4.622621872840682e-06
114
+ MRLS,3.963573362852912e-06
115
+ MCC,3.7632978546753293e-06
116
+ GBM,3.364578788023209e-06
117
+ DASTR,3.190721145074349e-06
118
+ WT,3.0390583560802042e-06
119
+ PAST,3.0256139780249214e-06
120
+ ECAD,2.9308980629139114e-06
121
+ UEC,2.8302536065893946e-06
122
+ SCLC,2.7639291602099547e-06
123
+ CHS,2.524923047531047e-06
124
+ PANET,1.8923717561847297e-06
125
+ MAAP,1.4423669654206606e-06
126
+ COADREAD,4.742931878354284e-07
127
+ NECNOS,3.882300347868295e-07
128
+ SCCNOS,3.831314643321093e-07
129
+ SARCNOS,3.6490285992840654e-07
130
+ CSCLC,3.121313341125642e-07
131
+ AMPCA,3.0839311193631147e-07
132
+ BRCANOS,3.0359336733454256e-07
133
+ MDLC,3.011147668985359e-07
134
+ LUAS,2.772100344827777e-07
135
+ MXOV,2.492066641934798e-07
136
+ UCEC,2.3689378281233076e-07
137
+ BRCA,2.2784350051097135e-07
138
+ PDC,2.1551392137553194e-07
139
+ PAASC,2.1430530239285872e-07
140
+ NETNOS,2.1316037646101904e-07
141
+ SBC,2.033053903005566e-07
142
+ NVRINT,1.984174957669893e-07
143
+ URCC,1.8473356533377228e-07
144
+ BRCNOS,1.7265826102175197e-07
145
+ NSCLC,1.700868494936003e-07
146
+ GBC,1.6887882736682513e-07
147
+ GINET,1.6104266364891373e-07
148
+ APAD,1.5570284972454829e-07
149
+ GNOS,1.547322483475e-07
150
+ CHOL,1.4654114011136699e-07
151
+ DIFG,1.4512880852635135e-07
152
+ MUP,1.4281692983786343e-07
153
+ MEL,1.391822195273562e-07
154
+ NSCLCPD,1.0773995029467187e-07
155
+ RCC,9.598431205404268e-08
156
+ UMEC,6.501737459529977e-08
157
+ ADNOS,0.0
158
+ CUP,0.0
159
+ CUPNOS,0.0
160
+ NOT,0.0
161
+ UDMN,0.0
test_slides/results/881837/881837_paladin_results.csv ADDED
@@ -0,0 +1,6 @@
+ Cancer Subtype,Biomarker,Score
+ BLCA,Del_6q,0.14185988903045654
+ BLCA,FGFR3_ONCOGENIC,0.08551423251628876
+ BLCA,RB1_ONCOGENIC,0.10338973999023438
+ BLCA,RB1_TRUNC,0.08562182635068893
+ BLCA,TP53_PATHWAY,0.8390844464302063
test_slides/test_samples.json ADDED
@@ -0,0 +1,29 @@
+ [
+   {
+     "sample_id": "P-0000034-T01-IM3",
+     "image_id": "881837",
+     "sex": "MALE",
+     "tissue_site": "Bladder",
+     "site_type": "Primary",
+     "cancer_type": "BLCA",
+     "confidence": 0.9412469863891602
+   },
+   {
+     "sample_id": "P-0000037-T01-IM3",
+     "image_id": "744547",
+     "sex": "Male",
+     "tissue_site": "Liver",
+     "site_type": "Metastasis",
+     "cancer_type": "HCC",
+     "confidence": 0.9471913576126099
+   },
+   {
+     "sample_id": "P-0000037-T02-IM3",
+     "image_id": "755246",
+     "sex": "Male",
+     "tissue_site": "Liver",
+     "site_type": "Primary",
+     "cancer_type": "HCC",
+     "confidence": 0.9306515455245972
+   }
+ ]
test_slides/verification_report.json ADDED
@@ -0,0 +1,38 @@
+ {
+   "total": 3,
+   "passed": 3,
+   "failed": 0,
+   "accuracy": 1.0,
+   "results": [
+     {
+       "slide_id": "881837",
+       "ground_truth": "BLCA",
+       "predicted": "BLCA",
+       "confidence": 0.9819191694259644,
+       "site_type": "Primary",
+       "sex": "MALE",
+       "tissue_site": "Bladder",
+       "status": "PASS"
+     },
+     {
+       "slide_id": "744547",
+       "ground_truth": "HCC",
+       "predicted": "HCC",
+       "confidence": 0.9949353337287904,
+       "site_type": "Metastasis",
+       "sex": "Male",
+       "tissue_site": "Liver",
+       "status": "PASS"
+     },
+     {
+       "slide_id": "755246",
+       "ground_truth": "HCC",
+       "predicted": "HCC",
+       "confidence": 0.9960950016975404,
+       "site_type": "Primary",
+       "sex": "Male",
+       "tissue_site": "Liver",
+       "status": "PASS"
+     }
+   ]
+ }
tests/inference/test_aeon.py CHANGED
@@ -4,43 +4,43 @@ import numpy as np
 import pytest
 import torch
 
-from mosaic.inference.aeon import (
+from mosaic.inference.aeon import CANCER_TYPES_TO_DROP
+from mosaic.inference.data import (
     CANCER_TYPE_TO_INT_MAP,
     INT_TO_CANCER_TYPE_MAP,
-    col_indices_to_drop,
 )
 
 
 class TestAeonConstants:
     """Test constants defined in aeon module."""
 
-    def test_col_indices_to_drop_is_list(self):
-        """Test that col_indices_to_drop is a list."""
-        assert isinstance(col_indices_to_drop, list)
+    def test_cancer_types_to_drop_is_list(self):
+        """Test that CANCER_TYPES_TO_DROP is a list."""
+        assert isinstance(CANCER_TYPES_TO_DROP, list)
 
-    def test_col_indices_to_drop_has_entries(self):
-        """Test that col_indices_to_drop has entries."""
-        assert len(col_indices_to_drop) > 0
+    def test_cancer_types_to_drop_has_entries(self):
+        """Test that CANCER_TYPES_TO_DROP has entries."""
+        assert len(CANCER_TYPES_TO_DROP) > 0
 
-    def test_col_indices_to_drop_are_integers(self):
-        """Test that all indices are integers."""
-        for idx in col_indices_to_drop:
-            assert isinstance(idx, int)
+    def test_cancer_types_to_drop_are_strings(self):
+        """Test that all cancer types are strings."""
+        for cancer_type in CANCER_TYPES_TO_DROP:
+            assert isinstance(cancer_type, str)
 
-    def test_col_indices_to_drop_are_valid(self):
-        """Test that all indices are valid cancer type indices."""
-        max_idx = max(CANCER_TYPE_TO_INT_MAP.values())
-        for idx in col_indices_to_drop:
-            assert 0 <= idx <= max_idx
+    def test_cancer_types_to_drop_are_valid(self):
+        """Test that all cancer types to drop are valid cancer type codes."""
+        # They should all be uppercase alphanumeric codes
+        for cancer_type in CANCER_TYPES_TO_DROP:
+            assert cancer_type.isupper()
+            assert len(cancer_type) >= 2
+            assert len(cancer_type) <= 10
 
-    def test_col_indices_to_drop_contains_expected_types(self):
+    def test_cancer_types_to_drop_contains_expected_types(self):
         """Test that specific cancer types are in the drop list."""
         # Check that some known cancer types to drop are in the list
-        drop_types = ["UDMN", "CUP", "BRCA", "MEL"]
-        for cancer_type in drop_types:
-            if cancer_type in CANCER_TYPE_TO_INT_MAP:
-                idx = CANCER_TYPE_TO_INT_MAP[cancer_type]
-                assert idx in col_indices_to_drop
+        expected_types = ["UDMN", "CUP", "NOT"]
+        for cancer_type in expected_types:
+            assert cancer_type in CANCER_TYPES_TO_DROP
 
     def test_cancer_type_maps_available(self):
         """Test that cancer type maps are available."""
uv.lock CHANGED
@@ -2336,6 +2336,40 @@ wheels = [
2336
  { url = "https://files.pythonhosted.org/packages/ef/70/a07dcf4f62598c8ad579df241af55ced65bed76e42e45d3c368a6d82dbc1/kombu-5.5.4-py3-none-any.whl", hash = "sha256:a12ed0557c238897d8e518f1d1fdf84bd1516c5e305af2dacd85c2015115feb8", size = 210034, upload-time = "2025-06-01T10:19:20.436Z" },
2337
  ]
2338
 
2339
  [[package]]
2340
  name = "loguru"
2341
  version = "0.7.3"
@@ -2565,11 +2599,14 @@ version = "0.1.0"
2565
  source = { editable = "." }
2566
  dependencies = [
2567
  { name = "gradio" },
 
2568
  { name = "loguru" },
2569
  { name = "memory-profiler" },
2570
  { name = "mussel", extra = ["torch-gpu"] },
2571
  { name = "paladin" },
 
2572
  { name = "spaces" },
 
2573
  ]
2574
 
2575
  [package.dev-dependencies]
@@ -2584,11 +2621,14 @@ dev = [
2584
  [package.metadata]
2585
  requires-dist = [
2586
  { name = "gradio", specifier = ">=5.49.0" },
 
2587
  { name = "loguru", specifier = ">=0.7.3" },
2588
  { name = "memory-profiler", specifier = ">=0.61.0" },
2589
  { name = "mussel", extras = ["torch-gpu"], git = "https://github.com/pathology-data-mining/Mussel.git?rev=mosaic-dev" },
2590
  { name = "paladin", git = "ssh://git@github.com/pathology-data-mining/paladin.git?rev=dev" },
 
2591
  { name = "spaces", specifier = ">=0.30.0" },
 
2592
  ]
2593
 
2594
  [package.metadata.requires-dev]
@@ -3199,8 +3239,8 @@ wheels = [
3199
 
3200
  [[package]]
3201
  name = "paladin"
3202
- version = "0.1.dev164+g0aef7cad1"
3203
- source = { git = "ssh://git@github.com/pathology-data-mining/paladin.git?rev=dev#0aef7cad1c2c4b54ea75406e3ed4e61c83591a71" }
3204
  dependencies = [
3205
  { name = "dvc" },
3206
  { name = "nn-template-core" },
@@ -3280,6 +3320,18 @@ wheels = [
3280
  { url = "https://files.pythonhosted.org/packages/cc/20/ff623b09d963f88bfde16306a54e12ee5ea43e9b597108672ff3a408aad6/pathspec-0.12.1-py3-none-any.whl", hash = "sha256:a0d503e138a4c123b27490a4f7beda6a01c6f288df0e4a8b79c7eb0dc7b4cc08", size = 31191, upload-time = "2023-12-10T22:30:43.14Z" },
3281
  ]
3282
 
3283
  [[package]]
3284
  name = "pillow"
3285
  version = "11.3.0"
@@ -4846,6 +4898,20 @@ wheels = [
4846
  { url = "https://files.pythonhosted.org/packages/5c/2e/10b7fe92ddc69e5aae177775a3c8ed890bdd6cb40c2aa04e0a982937edd1/scmrepo-3.5.2-py3-none-any.whl", hash = "sha256:6e4660572b76512d0e013ca9806692188c736e8c9c76f833e3674fc21a558788", size = 73868, upload-time = "2025-08-06T14:46:31.635Z" },
4847
  ]
4848
 
4849
  [[package]]
4850
  name = "semantic-version"
4851
  version = "2.10.0"
@@ -5052,6 +5118,52 @@ wheels = [
5052
  { url = "https://files.pythonhosted.org/packages/be/72/2db2f49247d0a18b4f1bb9a5a39a0162869acf235f3a96418363947b3d46/starlette-0.48.0-py3-none-any.whl", hash = "sha256:0764ca97b097582558ecb498132ed0c7d942f233f365b86ba37770e026510659", size = 73736, upload-time = "2025-09-13T08:41:03.869Z" },
5053
  ]
5054
 
5055
  [[package]]
5056
  name = "stqdm"
5057
  version = "0.0.5"
 
2336
  { url = "https://files.pythonhosted.org/packages/ef/70/a07dcf4f62598c8ad579df241af55ced65bed76e42e45d3c368a6d82dbc1/kombu-5.5.4-py3-none-any.whl", hash = "sha256:a12ed0557c238897d8e518f1d1fdf84bd1516c5e305af2dacd85c2015115feb8", size = 210034, upload-time = "2025-06-01T10:19:20.436Z" },
2337
  ]
2338
 
2339
+ [[package]]
2340
+ name = "lightning"
+ version = "2.6.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "fsspec", extra = ["http"] },
+ { name = "lightning-utilities" },
+ { name = "packaging" },
+ { name = "pytorch-lightning" },
+ { name = "pyyaml" },
+ { name = "torch" },
+ { name = "torchmetrics" },
+ { name = "tqdm" },
+ { name = "typing-extensions" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/1a/d5/892ea38816925b3511493e87b0b32494122bf8a20e66f4f2cd2667f95625/lightning-2.6.0.tar.gz", hash = "sha256:881841716b59c1837ae0c562c2e64fea9bcf49ef9de3867bd1f868557ec23d04", size = 656539, upload-time = "2025-11-28T09:34:25.069Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/d6/e9/36b340c7ec01dad6f034481e98fc9fc0133307beb05c714c0542af98bbde/lightning-2.6.0-py3-none-any.whl", hash = "sha256:f1a13a48909960a3454518486f113fae4fadb2db0e28e9c50d8d38d46c9dc3d6", size = 845956, upload-time = "2025-11-28T09:34:23.273Z" },
+ ]
+
+ [[package]]
+ name = "lightning-utilities"
+ version = "0.15.2"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "packaging" },
+ { name = "setuptools" },
+ { name = "typing-extensions" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/b8/39/6fc58ca81492db047149b4b8fd385aa1bfb8c28cd7cacb0c7eb0c44d842f/lightning_utilities-0.15.2.tar.gz", hash = "sha256:cdf12f530214a63dacefd713f180d1ecf5d165338101617b4742e8f22c032e24", size = 31090, upload-time = "2025-08-06T13:57:39.242Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/de/73/3d757cb3fc16f0f9794dd289bcd0c4a031d9cf54d8137d6b984b2d02edf3/lightning_utilities-0.15.2-py3-none-any.whl", hash = "sha256:ad3ab1703775044bbf880dbf7ddaaac899396c96315f3aa1779cec9d618a9841", size = 29431, upload-time = "2025-08-06T13:57:38.046Z" },
+ ]
+
  [[package]]
  name = "loguru"
  version = "0.7.3"
 
  source = { editable = "." }
  dependencies = [
  { name = "gradio" },
+ { name = "lightning" },
  { name = "loguru" },
  { name = "memory-profiler" },
  { name = "mussel", extra = ["torch-gpu"] },
  { name = "paladin" },
+ { name = "seaborn" },
  { name = "spaces" },
+ { name = "statsmodels" },
  ]

  [package.dev-dependencies]
 
  [package.metadata]
  requires-dist = [
  { name = "gradio", specifier = ">=5.49.0" },
+ { name = "lightning", specifier = ">=2.6.0" },
  { name = "loguru", specifier = ">=0.7.3" },
  { name = "memory-profiler", specifier = ">=0.61.0" },
  { name = "mussel", extras = ["torch-gpu"], git = "https://github.com/pathology-data-mining/Mussel.git?rev=mosaic-dev" },
  { name = "paladin", git = "ssh://git@github.com/pathology-data-mining/paladin.git?rev=dev" },
+ { name = "seaborn", specifier = ">=0.13.2" },
  { name = "spaces", specifier = ">=0.30.0" },
+ { name = "statsmodels", specifier = ">=0.14.6" },
  ]

  [package.metadata.requires-dev]
 
  [[package]]
  name = "paladin"
+ version = "0.0.0"
+ source = { git = "ssh://git@github.com/pathology-data-mining/paladin.git?rev=dev#de6dab1a40948285d2e8aad322b9aca91ae669e6" }
  dependencies = [
  { name = "dvc" },
  { name = "nn-template-core" },
 
  { url = "https://files.pythonhosted.org/packages/cc/20/ff623b09d963f88bfde16306a54e12ee5ea43e9b597108672ff3a408aad6/pathspec-0.12.1-py3-none-any.whl", hash = "sha256:a0d503e138a4c123b27490a4f7beda6a01c6f288df0e4a8b79c7eb0dc7b4cc08", size = 31191, upload-time = "2023-12-10T22:30:43.14Z" },
  ]

+ [[package]]
+ name = "patsy"
+ version = "1.0.2"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "numpy" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/be/44/ed13eccdd0519eff265f44b670d46fbb0ec813e2274932dc1c0e48520f7d/patsy-1.0.2.tar.gz", hash = "sha256:cdc995455f6233e90e22de72c37fcadb344e7586fb83f06696f54d92f8ce74c0", size = 399942, upload-time = "2025-10-20T16:17:37.535Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/f1/70/ba4b949bdc0490ab78d545459acd7702b211dfccf7eb89bbc1060f52818d/patsy-1.0.2-py2.py3-none-any.whl", hash = "sha256:37bfddbc58fcf0362febb5f54f10743f8b21dd2aa73dec7e7ef59d1b02ae668a", size = 233301, upload-time = "2025-10-20T16:17:36.563Z" },
+ ]
+
  [[package]]
  name = "pillow"
  version = "11.3.0"
 
  { url = "https://files.pythonhosted.org/packages/5c/2e/10b7fe92ddc69e5aae177775a3c8ed890bdd6cb40c2aa04e0a982937edd1/scmrepo-3.5.2-py3-none-any.whl", hash = "sha256:6e4660572b76512d0e013ca9806692188c736e8c9c76f833e3674fc21a558788", size = 73868, upload-time = "2025-08-06T14:46:31.635Z" },
  ]

+ [[package]]
+ name = "seaborn"
+ version = "0.13.2"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "matplotlib" },
+ { name = "numpy" },
+ { name = "pandas" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/86/59/a451d7420a77ab0b98f7affa3a1d78a313d2f7281a57afb1a34bae8ab412/seaborn-0.13.2.tar.gz", hash = "sha256:93e60a40988f4d65e9f4885df477e2fdaff6b73a9ded434c1ab356dd57eefff7", size = 1457696, upload-time = "2024-01-25T13:21:52.551Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/83/11/00d3c3dfc25ad54e731d91449895a79e4bf2384dc3ac01809010ba88f6d5/seaborn-0.13.2-py3-none-any.whl", hash = "sha256:636f8336facf092165e27924f223d3c62ca560b1f2bb5dff7ab7fad265361987", size = 294914, upload-time = "2024-01-25T13:21:49.598Z" },
+ ]
+
  [[package]]
  name = "semantic-version"
  version = "2.10.0"
5118
  { url = "https://files.pythonhosted.org/packages/be/72/2db2f49247d0a18b4f1bb9a5a39a0162869acf235f3a96418363947b3d46/starlette-0.48.0-py3-none-any.whl", hash = "sha256:0764ca97b097582558ecb498132ed0c7d942f233f365b86ba37770e026510659", size = 73736, upload-time = "2025-09-13T08:41:03.869Z" },
5119
  ]
5120
 
5121
+ [[package]]
5122
+ name = "statsmodels"
5123
+ version = "0.14.6"
5124
+ source = { registry = "https://pypi.org/simple" }
5125
+ dependencies = [
5126
+ { name = "numpy" },
5127
+ { name = "packaging" },
5128
+ { name = "pandas" },
5129
+ { name = "patsy" },
5130
+ { name = "scipy", version = "1.15.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" },
5131
+ { name = "scipy", version = "1.16.2", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" },
5132
+ ]
5133
+ sdist = { url = "https://files.pythonhosted.org/packages/0d/81/e8d74b34f85285f7335d30c5e3c2d7c0346997af9f3debf9a0a9a63de184/statsmodels-0.14.6.tar.gz", hash = "sha256:4d17873d3e607d398b85126cd4ed7aad89e4e9d89fc744cdab1af3189a996c2a", size = 20689085, upload-time = "2025-12-05T23:08:39.522Z" }
5134
+ wheels = [
5135
+ { url = "https://files.pythonhosted.org/packages/b5/6d/9ec309a175956f88eb8420ac564297f37cf9b1f73f89db74da861052dc29/statsmodels-0.14.6-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:f4ff0649a2df674c7ffb6fa1a06bffdb82a6adf09a48e90e000a15a6aaa734b0", size = 10142419, upload-time = "2025-12-05T19:27:35.625Z" },
5136
+ { url = "https://files.pythonhosted.org/packages/86/8f/338c5568315ec5bf3ac7cd4b71e34b98cb3b0f834919c0c04a0762f878a1/statsmodels-0.14.6-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:109012088b3e370080846ab053c76d125268631410142daad2f8c10770e8e8d9", size = 10022819, upload-time = "2025-12-05T19:27:49.385Z" },
5137
+ { url = "https://files.pythonhosted.org/packages/b0/77/5fc4cbc2d608f9b483b0675f82704a8bcd672962c379fe4d82100d388dbf/statsmodels-0.14.6-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e93bd5d220f3cb6fc5fc1bffd5b094966cab8ee99f6c57c02e95710513d6ac3f", size = 10118927, upload-time = "2025-12-05T23:07:51.256Z" },
5138
+ { url = "https://files.pythonhosted.org/packages/94/55/b86c861c32186403fe121d9ab27bc16d05839b170d92a978beb33abb995e/statsmodels-0.14.6-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:06eec42d682fdb09fe5d70a05930857efb141754ec5a5056a03304c1b5e32fd9", size = 10413015, upload-time = "2025-12-05T23:08:53.95Z" },
5139
+ { url = "https://files.pythonhosted.org/packages/f9/be/daf0dba729ccdc4176605f4a0fd5cfe71cdda671749dca10e74a732b8b1c/statsmodels-0.14.6-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:0444e88557df735eda7db330806fe09d51c9f888bb1f5906cb3a61fb1a3ed4a8", size = 10441248, upload-time = "2025-12-05T23:09:09.353Z" },
5140
+ { url = "https://files.pythonhosted.org/packages/9a/1c/2e10b7c7cc44fa418272996bf0427b8016718fd62f995d9c1f7ab37adf35/statsmodels-0.14.6-cp310-cp310-win_amd64.whl", hash = "sha256:e83a9abe653835da3b37fb6ae04b45480c1de11b3134bd40b09717192a1456ea", size = 9583410, upload-time = "2025-12-05T19:28:02.086Z" },
5141
+ { url = "https://files.pythonhosted.org/packages/a9/4d/df4dd089b406accfc3bb5ee53ba29bb3bdf5ae61643f86f8f604baa57656/statsmodels-0.14.6-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:6ad5c2810fc6c684254a7792bf1cbaf1606cdee2a253f8bd259c43135d87cfb4", size = 10121514, upload-time = "2025-12-05T19:28:16.521Z" },
5142
+ { url = "https://files.pythonhosted.org/packages/82/af/ec48daa7f861f993b91a0dcc791d66e1cf56510a235c5cbd2ab991a31d5c/statsmodels-0.14.6-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:341fa68a7403e10a95c7b6e41134b0da3a7b835ecff1eb266294408535a06eb6", size = 10003346, upload-time = "2025-12-05T19:28:29.568Z" },
5143
+ { url = "https://files.pythonhosted.org/packages/a9/2c/c8f7aa24cd729970728f3f98822fb45149adc216f445a9301e441f7ac760/statsmodels-0.14.6-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:bdf1dfe2a3ca56f5529118baf33a13efed2783c528f4a36409b46bbd2d9d48eb", size = 10129872, upload-time = "2025-12-05T23:09:25.724Z" },
5144
+ { url = "https://files.pythonhosted.org/packages/40/c6/9ae8e9b0721e9b6eb5f340c3a0ce8cd7cce4f66e03dd81f80d60f111987f/statsmodels-0.14.6-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a3764ba8195c9baf0925a96da0743ff218067a269f01d155ca3558deed2658ca", size = 10381964, upload-time = "2025-12-05T23:09:41.326Z" },
5145
+ { url = "https://files.pythonhosted.org/packages/28/8c/cf3d30c8c2da78e2ad1f50ade8b7fabec3ff4cdfc56fbc02e097c4577f90/statsmodels-0.14.6-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:9e8d2e519852adb1b420e018f5ac6e6684b2b877478adf7fda2cfdb58f5acb5d", size = 10409611, upload-time = "2025-12-05T23:09:57.131Z" },
5146
+ { url = "https://files.pythonhosted.org/packages/bf/cc/018f14ecb58c6cb89de9d52695740b7d1f5a982aa9ea312483ea3c3d5f77/statsmodels-0.14.6-cp311-cp311-win_amd64.whl", hash = "sha256:2738a00fca51196f5a7d44b06970ace6b8b30289839e4808d656f8a98e35faa7", size = 9580385, upload-time = "2025-12-05T19:28:42.778Z" },
5147
+ { url = "https://files.pythonhosted.org/packages/25/ce/308e5e5da57515dd7cab3ec37ea2d5b8ff50bef1fcc8e6d31456f9fae08e/statsmodels-0.14.6-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:fe76140ae7adc5ff0e60a3f0d56f4fffef484efa803c3efebf2fcd734d72ecb5", size = 10091932, upload-time = "2025-12-05T19:28:55.446Z" },
5148
+ { url = "https://files.pythonhosted.org/packages/05/30/affbabf3c27fb501ec7b5808230c619d4d1a4525c07301074eb4bda92fa9/statsmodels-0.14.6-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:26d4f0ed3b31f3c86f83a92f5c1f5cbe63fc992cd8915daf28ca49be14463a1c", size = 9997345, upload-time = "2025-12-05T19:29:10.278Z" },
5149
+ { url = "https://files.pythonhosted.org/packages/48/f5/3a73b51e6450c31652c53a8e12e24eac64e3824be816c0c2316e7dbdcb7d/statsmodels-0.14.6-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d8c00a42863e4f4733ac9d078bbfad816249c01451740e6f5053ecc7db6d6368", size = 10058649, upload-time = "2025-12-05T23:10:12.775Z" },
5150
+ { url = "https://files.pythonhosted.org/packages/81/68/dddd76117df2ef14c943c6bbb6618be5c9401280046f4ddfc9fb4596a1b8/statsmodels-0.14.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:19b58cf7474aa9e7e3b0771a66537148b2df9b5884fbf156096c0e6c1ff0469d", size = 10339446, upload-time = "2025-12-05T23:10:28.503Z" },
5151
+ { url = "https://files.pythonhosted.org/packages/56/4a/dce451c74c4050535fac1ec0c14b80706d8fc134c9da22db3c8a0ec62c33/statsmodels-0.14.6-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:81e7dcc5e9587f2567e52deaff5220b175bf2f648951549eae5fc9383b62bc37", size = 10368705, upload-time = "2025-12-05T23:10:44.339Z" },
5152
+ { url = "https://files.pythonhosted.org/packages/60/15/3daba2df40be8b8a9a027d7f54c8dedf24f0d81b96e54b52293f5f7e3418/statsmodels-0.14.6-cp312-cp312-win_amd64.whl", hash = "sha256:b5eb07acd115aa6208b4058211138393a7e6c2cf12b6f213ede10f658f6a714f", size = 9543991, upload-time = "2025-12-05T23:10:58.536Z" },
5153
+ { url = "https://files.pythonhosted.org/packages/81/59/a5aad5b0cc266f5be013db8cde563ac5d2a025e7efc0c328d83b50c72992/statsmodels-0.14.6-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:47ee7af083623d2091954fa71c7549b8443168f41b7c5dce66510274c50fd73e", size = 10072009, upload-time = "2025-12-05T23:11:14.021Z" },
5154
+ { url = "https://files.pythonhosted.org/packages/53/dd/d8cfa7922fc6dc3c56fa6c59b348ea7de829a94cd73208c6f8202dd33f17/statsmodels-0.14.6-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:aa60d82e29fcd0a736e86feb63a11d2380322d77a9369a54be8b0965a3985f71", size = 9980018, upload-time = "2025-12-05T23:11:30.907Z" },
5155
+ { url = "https://files.pythonhosted.org/packages/ee/77/0ec96803eba444efd75dba32f2ef88765ae3e8f567d276805391ec2c98c6/statsmodels-0.14.6-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:89ee7d595f5939cc20bf946faedcb5137d975f03ae080f300ebb4398f16a5bd4", size = 10060269, upload-time = "2025-12-05T23:11:46.338Z" },
5156
+ { url = "https://files.pythonhosted.org/packages/10/b9/fd41f1f6af13a1a1212a06bb377b17762feaa6d656947bf666f76300fc05/statsmodels-0.14.6-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:730f3297b26749b216a06e4327fe0be59b8d05f7d594fb6caff4287b69654589", size = 10324155, upload-time = "2025-12-05T23:12:01.805Z" },
5157
+ { url = "https://files.pythonhosted.org/packages/ee/0f/a6900e220abd2c69cd0a07e3ad26c71984be6061415a60e0f17b152ecf08/statsmodels-0.14.6-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:f1c08befa85e93acc992b72a390ddb7bd876190f1360e61d10cf43833463bc9c", size = 10349765, upload-time = "2025-12-05T23:12:18.018Z" },
5158
+ { url = "https://files.pythonhosted.org/packages/98/08/b79f0c614f38e566eebbdcff90c0bcacf3c6ba7a5bbb12183c09c29ca400/statsmodels-0.14.6-cp313-cp313-win_amd64.whl", hash = "sha256:8021271a79f35b842c02a1794465a651a9d06ec2080f76ebc3b7adce77d08233", size = 9540043, upload-time = "2025-12-05T23:12:33.887Z" },
5159
+ { url = "https://files.pythonhosted.org/packages/71/de/09540e870318e0c7b58316561d417be45eff731263b4234fdd2eee3511a8/statsmodels-0.14.6-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:00781869991f8f02ad3610da6627fd26ebe262210287beb59761982a8fa88cae", size = 10069403, upload-time = "2025-12-05T23:12:48.424Z" },
5160
+ { url = "https://files.pythonhosted.org/packages/ab/f0/63c1bfda75dc53cee858006e1f46bd6d6f883853bea1b97949d0087766ca/statsmodels-0.14.6-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:73f305fbf31607b35ce919fae636ab8b80d175328ed38fdc6f354e813b86ee37", size = 9989253, upload-time = "2025-12-05T23:13:05.274Z" },
5161
+ { url = "https://files.pythonhosted.org/packages/c1/98/b0dfb4f542b2033a3341aa5f1bdd97024230a4ad3670c5b0839d54e3dcab/statsmodels-0.14.6-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e443e7077a6e2d3faeea72f5a92c9f12c63722686eb80bb40a0f04e4a7e267ad", size = 10090802, upload-time = "2025-12-05T23:13:20.653Z" },
5162
+ { url = "https://files.pythonhosted.org/packages/34/0e/2408735aca9e764643196212f9069912100151414dd617d39ffc72d77eee/statsmodels-0.14.6-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3414e40c073d725007a6603a18247ab7af3467e1af4a5e5a24e4c27bc26673b4", size = 10337587, upload-time = "2025-12-05T23:13:37.597Z" },
5163
+ { url = "https://files.pythonhosted.org/packages/0f/36/4d44f7035ab3c0b2b6a4c4ebb98dedf36246ccbc1b3e2f51ebcd7ac83abb/statsmodels-0.14.6-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:a518d3f9889ef920116f9fa56d0338069e110f823926356946dae83bc9e33e19", size = 10363350, upload-time = "2025-12-05T23:13:53.08Z" },
5164
+ { url = "https://files.pythonhosted.org/packages/26/33/f1652d0c59fa51de18492ee2345b65372550501ad061daa38f950be390b6/statsmodels-0.14.6-cp314-cp314-win_amd64.whl", hash = "sha256:151b73e29f01fe619dbce7f66d61a356e9d1fe5e906529b78807df9189c37721", size = 9588010, upload-time = "2025-12-05T23:14:07.28Z" },
5165
+ ]
5166
+
5167
  [[package]]
5168
  name = "stqdm"
5169
  version = "0.0.5"