# Mosaic Scripts

This directory contains utility scripts for working with the Mosaic pipeline, particularly for Aeon model testing and deployment.

## Aeon Model Scripts

### 1. export_aeon_checkpoint.py

Export a PyTorch Lightning checkpoint to pickle format for inference.

**Usage:**
```bash
python scripts/export_aeon_checkpoint.py \
    --checkpoint data/checkpoint.ckpt \
    --output data/aeon_model.pkl \
    --metadata-dir data/metadata
```

**Arguments:**
- `--checkpoint`: Path to PyTorch Lightning checkpoint (.ckpt file)
- `--output`: Path to save exported model (.pkl file)
- `--metadata-dir`: Directory containing metadata files (default: data/metadata)

**Requirements:**
- `paladin` package installed from its git repository (must provide `AeonLightningModule`)
- PyTorch Lightning
- Metadata files: n_classes.txt, ontology_embedding_dim.txt, target_dict.tsv

**Example:**
```bash
# Export the checkpoint
uv run python scripts/export_aeon_checkpoint.py \
    --checkpoint data/checkpoint.ckpt \
    --output data/aeon_model.pkl

# Output:
# Loading metadata from data/metadata...
# Loading checkpoint from data/checkpoint.ckpt...
# Saving model to data/aeon_model.pkl...
# ✓ Successfully exported checkpoint to data/aeon_model.pkl
#   Model size: 118.0 MB
#   Model class: AeonLateAggregator
#   Number of classes: 160
#   Ontology embedding dim: 20
#   Number of histologies: 160
```
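
Once exported, the `.pkl` file can be read back with Python's standard `pickle` module. The following is a minimal sketch, not the pipeline's own loader: `load_exported_model` is a hypothetical helper, and it assumes the export is a plain pickle and that the `paladin` package is importable so the model class can be reconstructed.

```python
import pickle
from pathlib import Path


def load_exported_model(path):
    """Load a pickled object from disk (hypothetical helper).

    Unpickling can execute arbitrary code, so only load files you trust.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Exported model not found: {path}")
    with path.open("rb") as fh:
        return pickle.load(fh)
```

For example, `model = load_exported_model("data/aeon_model.pkl")` would return whatever object the export script saved.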

### 2. run_aeon_tests.sh

Run the Aeon model on test slides and validate predictions.

**Usage:**
```bash
./scripts/run_aeon_tests.sh
```

**Configuration:**
The script reads test samples from `test_slides/test_samples.json` and runs each slide through the full Mosaic pipeline with the following settings:
- Cancer subtype: Unknown (triggers Aeon inference)
- Segmentation config: Biopsy
- Number of workers: 4

**Output:**
- Results saved to `test_slides/results/{slide_id}/`
- Logs saved to `test_slides/logs/`
- Summary showing passed/failed tests

**Example Output:**
```
=========================================
Aeon Model Test Suite
=========================================

Found 3 test slides

=========================================
Processing slide 1/3: 881837
=========================================
Ground Truth:
  Cancer Subtype: BLCA
  Site Type: Primary
  Sex: Male
  Tissue Site: Bladder

Running Mosaic pipeline...

Aeon Prediction:
  Predicted: BLCA
  Confidence: 0.9819

✓ PASS: Prediction matches ground truth

[... continues for all slides ...]

=========================================
Test Summary
=========================================
Total slides: 3
Passed: 3
Failed: 0

All tests passed!
```

### 3. verify_aeon_results.py

Verify Aeon test results against expected ground truth.

**Usage:**
```bash
python scripts/verify_aeon_results.py \
    --test-samples test_slides/test_samples.json \
    --results-dir test_slides/results \
    --output test_slides/verification_report.json
```

**Arguments:**
- `--test-samples`: Path to test samples JSON file (default: test_slides/test_samples.json)
- `--results-dir`: Directory containing results (default: test_slides/results)
- `--output`: Optional path to save verification report as JSON

**Example:**
```bash
# Verify results and save report
uv run python scripts/verify_aeon_results.py \
    --output test_slides/verification_report.json

# Output:
# ================================================================================
# Aeon Model Verification Report
# ================================================================================
#
# Slide: 881837
#   Ground Truth: BLCA
#   Site Type: Primary
#   Sex: Male
#   Tissue Site: Bladder
#   Predicted: BLCA
#   Confidence: 0.9819 (98.19%)
#   Status: ✓ PASS
#
# [... continues for all slides ...]
#
# ================================================================================
# Summary
# ================================================================================
# Total slides: 3
# Passed: 3 (100.0%)
# Failed: 0 (0.0%)
#
# ✓ All tests passed!
#
# Confidence Statistics (for passed tests):
#   Average: 0.9910 (99.10%)
#   Minimum: 0.9819 (98.19%)
#   Maximum: 0.9961 (99.61%)
```
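
The summary figures in a report like this can be derived from per-slide results. The sketch below shows one way to compute them; it is an illustration, not the script's actual code. The `summarize` function and the `expected`, `predicted`, and `confidence` keys are assumed names.

```python
from statistics import mean


def summarize(results):
    """Summarize pass/fail counts and confidence statistics.

    `results` is a list of dicts with the (assumed) keys
    'expected', 'predicted', and 'confidence'.
    """
    passed = [r for r in results if r["predicted"] == r["expected"]]
    total = len(results)
    summary = {
        "total": total,
        "passed": len(passed),
        "failed": total - len(passed),
        "pass_rate": 100.0 * len(passed) / total if total else 0.0,
    }
    # Confidence statistics are computed over passed tests only.
    if passed:
        confidences = [r["confidence"] for r in passed]
        summary["confidence"] = {
            "average": mean(confidences),
            "minimum": min(confidences),
            "maximum": max(confidences),
        }
    return summary
```

Restricting the statistics to passed tests mirrors the "for passed tests" label in the report, so a confidently wrong prediction does not inflate the average.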

## Workflow

### Complete Testing Workflow

1. **Export checkpoint** (if needed):
   ```bash
   uv run python scripts/export_aeon_checkpoint.py \
       --checkpoint data/checkpoint.ckpt \
       --output data/aeon_model.pkl
   ```

2. **Run tests**:
   ```bash
   ./scripts/run_aeon_tests.sh
   ```

3. **Verify results**:
   ```bash
   uv run python scripts/verify_aeon_results.py \
       --output test_slides/verification_report.json
   ```

### Quick Verification

If you already have test results and just want to verify them:

```bash
uv run python scripts/verify_aeon_results.py
```

## Test Samples Format

The test samples JSON file should have this format:

```json
[
  {
    "slide_id": "881837",
    "cancer_subtype": "BLCA",
    "site_type": "Primary",
    "sex": "Male",
    "tissue_site": "Bladder"
  },
  {
    "slide_id": "744547",
    "cancer_subtype": "HCC",
    "site_type": "Metastatic",
    "sex": "Male",
    "tissue_site": "Liver"
  }
]
```
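
Before running the tests, it can help to check that the file matches this shape. The following sketch is an optional pre-check rather than part of the shipped scripts. It assumes all five fields shown above are required non-empty strings and that `slide_id` values should be unique; `validate_samples` is a hypothetical helper.

```python
import json

# Fields taken from the example above; treating all of them as required is an assumption.
REQUIRED_FIELDS = ("slide_id", "cancer_subtype", "site_type", "sex", "tissue_site")


def validate_samples(samples):
    """Return a list of problems; an empty list means the data looks valid."""
    if not isinstance(samples, list):
        return ["top-level JSON value must be a list"]
    problems = []
    seen_ids = set()
    for i, sample in enumerate(samples):
        if not isinstance(sample, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        for field in REQUIRED_FIELDS:
            value = sample.get(field)
            if not isinstance(value, str) or not value:
                problems.append(f"entry {i}: missing or empty '{field}'")
        slide_id = sample.get("slide_id")
        if isinstance(slide_id, str) and slide_id:
            if slide_id in seen_ids:
                problems.append(f"entry {i}: duplicate slide_id '{slide_id}'")
            seen_ids.add(slide_id)
    return problems


def validate_file(path):
    """Load a JSON file and validate its contents."""
    with open(path, encoding="utf-8") as fh:
        return validate_samples(json.load(fh))
```

Calling `validate_file("test_slides/test_samples.json")` returns an empty list when every entry has the expected fields.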

## Dependencies

All scripts require:
- Python 3.10+
- uv package manager
- Mosaic package with dependencies

Additional requirements for checkpoint export:
- `paladin` installed from its git repository (dev branch)
- PyTorch Lightning

## Exit Codes

- `0`: Success (all tests passed)
- `1`: Failure (one or more tests failed)

## Troubleshooting

### "AeonLightningModule not found"
The installed `paladin` package likely predates `AeonLightningModule`. Upgrade it:
```bash
uv sync --upgrade-package paladin
```

### "Metadata files not found"
Make sure you have:
- `data/metadata/n_classes.txt`
- `data/metadata/ontology_embedding_dim.txt`
- `data/metadata/target_dict.tsv`
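
To see quickly which of these files is absent, a short check like the one below can be used. It is a sketch built around a hypothetical `missing_metadata` helper; the file names and default directory are the ones listed above.

```python
from pathlib import Path

METADATA_FILES = ("n_classes.txt", "ontology_embedding_dim.txt", "target_dict.tsv")


def missing_metadata(metadata_dir="data/metadata"):
    """Return the names of expected metadata files that do not exist."""
    root = Path(metadata_dir)
    return [name for name in METADATA_FILES if not (root / name).is_file()]
```

An empty list from `missing_metadata()` means all three files are in place.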

### "Test slides not found"
Place your test slides in the `test_slides/` directory and update `test_samples.json` with the correct paths.

## See Also

- [AEON_TEST_SUMMARY.md](../test_slides/AEON_TEST_SUMMARY.md) - Detailed test results and validation
- [README.md](../README.md) - Main Mosaic documentation