raylim committed on
Commit 343d8bf · 1 Parent(s): 847294a

fix: resolve column mismatch error in get_settings function

Removed the ihc_subtype parameter from get_settings() so that each row matches
SETTINGS_COLUMNS, which has 6 columns (IHC Subtype is commented out).
This fixes ValueError: 6 columns passed, passed data had 7 columns.

Changes:
- Remove ihc_subtype from get_settings() signature
- Remove ihc_subtype from settings list in get_settings()
- Remove ihc_subtype argument from get_settings() call in update_files()

All 410 tests pass.
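
The error quoted above is the one pandas raises when a row carries more values than the column list has names. The following is a minimal sketch of that failure mode, not the project's code: the column names, their order, and the helper `build_settings` are assumptions for illustration; only the count of six columns and the commented-out IHC Subtype come from the commit message.

```python
import pandas as pd

# Assumed stand-in for SETTINGS_COLUMNS: six columns, IHC Subtype left out.
SETTINGS_COLUMNS = [
    "Slide",
    "Site Type",
    "Sex",
    "Tissue Site",
    "Cancer Subtype",
    # "IHC Subtype",  # commented out, as described in the commit message
    "Segmentation Config",
]


def build_settings(rows):
    """Build the settings table; pandas raises ValueError if a row is too wide."""
    return pd.DataFrame(rows, columns=SETTINGS_COLUMNS)


# Seven values: the extra IHC value no longer has a matching column.
row_before_fix = [
    "slide_1.svs", "Primary", "Female", "Breast", "Unknown", "HR+/HER2-", "Biopsy",
]
# Six values: one per entry in SETTINGS_COLUMNS.
row_after_fix = [
    "slide_1.svs", "Primary", "Female", "Breast", "Unknown", "Biopsy",
]

try:
    build_settings([row_before_fix])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

fixed_df = build_settings([row_after_fix])
print(error_message)
print(fixed_df.shape)
```

Dropping the seventh value from each row, as this commit does, is the smaller change; the alternative would be to restore the IHC Subtype column.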

README.md CHANGED
@@ -27,6 +27,11 @@ Mosaic is a deep learning model designed for predicting cancer subtypes and biom
27
  - [Output Files](#output-files)
28
  - [Examples](#examples)
29
  - [Advanced Usage](#advanced-usage)
30
  - [CSV File Format](#csv-file-format)
31
  - [Cancer Subtypes](#cancer-subtypes)
32
  - [Troubleshooting](#troubleshooting)
@@ -330,6 +335,116 @@ mosaic --debug
330
 
331
  This will create a `debug.log` file with detailed information about the processing steps.
332
 
333
  ## CSV File Format
334
 
335
  When processing multiple slides using the `--slide-csv` option, the CSV file must contain the following columns:
 
27
  - [Output Files](#output-files)
28
  - [Examples](#examples)
29
  - [Advanced Usage](#advanced-usage)
30
+ - [User Storage Management (HF Spaces)](#user-storage-management-hf-spaces)
31
+ - [My Files Tab](#my-files-tab)
32
+ - [My Results Tab](#my-results-tab)
33
+ - [Storage Quotas](#storage-quotas)
34
+ - [Local Debug Mode](#local-debug-mode)
35
  - [CSV File Format](#csv-file-format)
36
  - [Cancer Subtypes](#cancer-subtypes)
37
  - [Troubleshooting](#troubleshooting)
 
335
 
336
  This will create a `debug.log` file with detailed information about the processing steps.
337
 
338
+ ## User Storage Management (HF Spaces)
339
+
340
+ When running Mosaic on HuggingFace Spaces, logged-in users have access to ephemeral file storage for uploaded slides and analysis results. This feature allows you to:
341
+
342
+ - **Re-analyze slides** with different settings without re-uploading
343
+ - **View previous analysis results** from past sessions
344
+ - **Download results** at any time during your session
345
+
346
+ **Important**: All stored files are **ephemeral** and will be deleted when the HuggingFace Spaces instance restarts. This typically happens during updates or when the instance is idle for extended periods.
347
+
348
+ ### My Files Tab
349
+
350
+ The **My Files** tab (visible only when logged in) provides access to your uploaded slides:
351
+
352
+ **Features:**
353
+ - **Storage usage display**: Shows current usage vs. quota (e.g., "2.3 GB / 5 GB")
354
+ - **Color-coded warnings**:
355
+ - 💾 Normal: < 80% quota used
356
+ - ⚠️ Warning: 80-99% quota used (delete old files to free space)
357
+ - ⛔ Error: ≥ 100% quota exceeded (upload blocked until space freed)
358
+ - **File browser**: View all uploaded slides with:
359
+ - Slide ID (unique identifier)
360
+ - Original filename
361
+ - File size
362
+ - Upload date
363
+ - Number of analyses performed
364
+ - **File actions**:
365
+ - **Download**: Download original slide file
366
+ - **Delete**: Remove slide and all associated analysis results
367
+ - **Refresh**: Update file list
368
+
369
+ **Typical workflow:**
370
+ 1. Upload slides via main analysis tab
371
+ 2. Review uploaded files in My Files tab
372
+ 3. Delete old slides when approaching quota limit
373
+
374
+ ### My Results Tab
375
+
376
+ The **My Results** tab (visible only when logged in) displays all analysis results from your session:
377
+
378
+ **Features:**
379
+ - **Results browser**: View all analyses with:
380
+ - Analysis ID (unique identifier)
381
+ - Slide name
382
+ - Analysis date/time
383
+ - Predicted cancer subtype
384
+ - Analysis settings (sex, tissue site, site type)
385
+ - **Result viewer**: Select an analysis to view:
386
+ - Full metadata (settings, timestamps)
387
+ - Tissue segmentation mask (PNG)
388
+ - Aeon predictions (top cancer subtypes with confidence scores)
389
+ - Paladin biomarker predictions (if applicable)
390
+ - **Result actions**:
391
+ - **View**: Display full analysis details
392
+ - **Download ZIP**: Download all results as a ZIP file
393
+ - **Delete**: Remove specific analysis result
394
+ - **Refresh**: Update results list
395
+
396
+ **Download ZIP contents:**
397
+ ```
398
+ {analysis_id}.zip
399
+ ├── metadata.json # Analysis settings and timestamps
400
+ ├── slide_mask.png # Tissue segmentation visualization
401
+ ├── {analysis_id}_aeon_results.csv # Cancer subtype predictions
402
+ └── {analysis_id}_paladin_results.csv # Biomarker predictions (if available)
403
+ ```
404
+
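
The archive layout above can be sketched with the standard library. This is an illustrative example only: `build_results_zip` and its parameters are hypothetical names, and the project's own `create_results_zip` in `mosaic.user_results` is not shown in this diff and may differ.

```python
import io
import json
import zipfile


def build_results_zip(analysis_id, metadata, mask_png, aeon_csv, paladin_csv=None):
    """Return the bytes of a ZIP archive laid out like the tree above."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("metadata.json", json.dumps(metadata, indent=2))
        archive.writestr("slide_mask.png", mask_png)
        archive.writestr(f"{analysis_id}_aeon_results.csv", aeon_csv)
        # Paladin results are optional, so only add them when present.
        if paladin_csv is not None:
            archive.writestr(f"{analysis_id}_paladin_results.csv", paladin_csv)
    return buffer.getvalue()


# Placeholder inputs; real masks and CSVs would come from an analysis run.
zip_bytes = build_results_zip(
    analysis_id="demo_0",
    metadata={"sex": "Female", "site_type": "Primary"},
    mask_png=b"placeholder mask bytes",
    aeon_csv="subtype,confidence\n",
)
names = zipfile.ZipFile(io.BytesIO(zip_bytes)).namelist()
print(names)
```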
405
+ ### Storage Quotas
406
+
407
+ **Per-user quota**: 5 GB (default)
408
+
409
+ This limit is enforced to prevent disk exhaustion on shared HuggingFace Spaces instances. When you approach or exceed your quota:
410
+
411
+ 1. **Automatic cleanup**: Oldest files are deleted automatically (FIFO - First In, First Out)
412
+ 2. **Manual cleanup**: You can delete files manually via the My Files tab
413
+ 3. **Upload blocking**: New uploads are blocked at 100% quota until space is freed
414
+
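
As a rough illustration of the FIFO cleanup described above, the sketch below evicts the oldest files until an incoming upload fits within the quota. `StoredFile` and `fifo_evict` are hypothetical names; the actual cleanup logic lives in `mosaic.user_storage`, which is not part of this diff, and may behave differently.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class StoredFile:
    name: str
    size_bytes: int
    upload_time: str  # ISO 8601 timestamp, so string order matches time order


def fifo_evict(files, quota_bytes, incoming_bytes=0):
    """Drop the oldest files first until existing files plus the upload fit."""
    kept = sorted(files, key=lambda f: f.upload_time)
    evicted = []
    total = sum(f.size_bytes for f in kept) + incoming_bytes
    while kept and total > quota_bytes:
        oldest = kept.pop(0)
        evicted.append(oldest)
        total -= oldest.size_bytes
    return kept, evicted


files = [
    StoredFile("b.svs", 2 * GIB, "2025-01-02T10:00:00"),
    StoredFile("a.svs", 2 * GIB, "2025-01-01T10:00:00"),
    StoredFile("c.svs", 1 * GIB, "2025-01-03T10:00:00"),
]
# 5 GiB stored plus a 1 GiB upload exceeds a 5 GiB quota, so the oldest file goes.
kept, evicted = fifo_evict(files, quota_bytes=5 * GIB, incoming_bytes=1 * GIB)
print([f.name for f in kept], [f.name for f in evicted])
```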
415
+ **Typical storage usage:**
416
+ - Small WSI (biopsy): 100-300 MB
417
+ - Medium WSI (tissue section): 500 MB - 1 GB
418
+ - Large WSI (whole tissue): 1-2 GB
419
+ - Analysis results: ~1-2 MB each (negligible)
420
+
421
+ **Example**: With a 5 GB quota, you can store approximately **5-10 medium-sized slides** (500 MB - 1 GB each) concurrently.
422
+
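
The warning levels listed under My Files map onto a simple threshold check. The sketch below assumes the 80% and 100% cut-offs and the 5 GB default from this README; `storage_status` is a hypothetical helper, not a function from the codebase.

```python
GIB = 1024 ** 3


def storage_status(used_bytes, quota_bytes=5 * GIB):
    """Classify usage as 'normal', 'warning', or 'error' and return the percentage."""
    # A zero quota is treated as 0% used, mirroring the guard in the UI code.
    percent = used_bytes * 100 / quota_bytes if quota_bytes > 0 else 0.0
    if percent >= 100:
        level = "error"  # uploads blocked until space is freed
    elif percent >= 80:
        level = "warning"  # nearly full
    else:
        level = "normal"
    return level, percent


for used in (int(2.3 * GIB), 4 * GIB, 5 * GIB):
    level, percent = storage_status(used)
    print(f"{used / GIB:.1f} GB used: {level} ({percent:.0f}%)")
```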
423
+ ### Local Debug Mode
424
+
425
+ When running Mosaic **locally** (not on HuggingFace Spaces), the My Files and My Results tabs are still available for debugging:
426
+
427
+ - All files are stored under a **universal "local_user"** username
428
+ - Storage path: `/tmp/mosaic_user_data/local_user/`
429
+ - UI shows **🔧 [Local Debug Mode]** indicator
430
+ - Files are still ephemeral (cleared on system reboot)
431
+ - No authentication required
432
+
433
+ This mode is useful for:
434
+ - Testing the storage feature locally
435
+ - Debugging upload/result workflows
436
+ - Development and testing
437
+
438
+ **Enable local debug mode:**
439
+ ```bash
440
+ # Simply run locally (not on HuggingFace Spaces)
441
+ make run-ui
442
+ # or
443
+ mosaic
444
+ ```
445
+
446
+ The tabs will automatically detect local mode and show the debug indicator.
447
+
448
  ## CSV File Format
449
 
450
  When processing multiple slides using the `--slide-csv` option, the CSV file must contain the following columns:
src/mosaic/gradio_app.py CHANGED
@@ -312,7 +312,7 @@ def main():
         args.sex,
         args.tissue_site,
         args.cancer_subtype,
-        args.ihc_subtype,
+        # args.ihc_subtype, # Not yet in use
         args.segmentation_config,
     ]
 ],
src/mosaic/ui/app.py CHANGED
@@ -189,6 +189,7 @@ def analyze_slides(
189
  ihc_subtype,
190
  seg_config,
191
  user_dir,
 
192
  progress=gr.Progress(track_tqdm=True),
193
  request: gr.Request = None,
194
  profile: Optional[gr.OAuthProfile] = None,
@@ -507,6 +508,63 @@ def analyze_slides(
507
  f"Results are still valid but won't be cached for future use."
508
  )
509
 
510
  if slide_mask is not None:
511
  mask_filename = f"{Path(slide_name).stem}_mask.png"
512
  saved = _save_mask_as_png(slide_mask, mask_filename)
@@ -679,6 +737,9 @@ def launch_gradio(server_name, server_port, share):
679
 
680
  with gr.Blocks(title="Mosaic") as demo:
681
  user_dir_state = gr.State(None)
682
 
683
  # Add login button for OAuth (only active on HF Spaces)
684
  gr.LoginButton()
@@ -879,6 +940,24 @@ This tool is for research purposes only and not approved for clinical diagnosis.
879
  file_count="multiple",
880
  )
881
 
882
  # TCGA fetch components (hidden by default)
883
  with gr.Group(visible=False) as tcga_input_group:
884
  tcga_id_input = gr.Textbox(
@@ -1002,7 +1081,7 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1002
  )
1003
 
1004
  def get_settings(
1005
- files, site_type, sex, tissue_site, cancer_subtype, ihc_subtype, seg_config
1006
  ):
1007
  """Generate initial settings DataFrame from uploaded files and dropdown values."""
1008
  if files is None:
@@ -1018,7 +1097,6 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1018
  sex if sex is not None else "",
1019
  tissue_site,
1020
  cancer_subtype,
1021
- ihc_subtype,
1022
  seg_config,
1023
  ]
1024
  )
@@ -1062,6 +1140,148 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1062
  gr.File(visible=False),
1063
  )
1064
 
1065
  # Handle TCGA slide fetch
1066
  def fetch_tcga_slide(tcga_id, seg_config, progress=gr.Progress()):
1067
  """Fetch a slide from TCGA and pre-fill metadata."""
@@ -1191,6 +1411,97 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1191
  )
1192
 
1193
  # Handle file uploads - regenerate entire settings table
1194
  @input_slides.change(
1195
  inputs=[
1196
  input_slides,
@@ -1216,7 +1527,6 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1216
  sex,
1217
  tissue_site,
1218
  cancer_subtype,
1219
- ihc_subtype,
1220
  seg_config,
1221
  )
1222
  # if settings_df is not None:
@@ -1326,6 +1636,7 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1326
  ihc_subtype,
1327
  seg_config,
1328
  user_dir,
 
1329
  progress=gr.Progress(track_tqdm=True),
1330
  request: gr.Request = None,
1331
  profile: Optional[gr.OAuthProfile] = None,
@@ -1371,6 +1682,7 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1371
  ihc_subtype,
1372
  seg_config,
1373
  user_dir,
 
1374
  progress=progress,
1375
  request=request,
1376
  profile=profile,
@@ -1390,6 +1702,7 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1390
  ihc_subtype_dropdown,
1391
  seg_config_dropdown,
1392
  user_dir_state,
 
1393
  ],
1394
  outputs=[
1395
  settings_input,
@@ -1456,6 +1769,70 @@ This tool is for research purposes only and not approved for clinical diagnosis.
1456
  outputs=[user_dir_state],
1457
  )
1458
 
1459
  # Use hardware-specific concurrency limit
1460
  # T4 GPUs (16GB) can only handle one analysis at a time to prevent OOM
1461
  # Higher-memory GPUs and ZeroGPU can handle multiple concurrent analyses
 
189
  ihc_subtype,
190
  seg_config,
191
  user_dir,
192
+ slide_ids=None, # Mapping from filename to slide_id
193
  progress=gr.Progress(track_tqdm=True),
194
  request: gr.Request = None,
195
  profile: Optional[gr.OAuthProfile] = None,
 
508
  f"Results are still valid but won't be cached for future use."
509
  )
510
 
511
+ # Save results for logged-in users (best-effort, non-blocking)
512
+ # On HF Spaces: only for logged-in users
513
+ # Locally: always save for 'local_user' for debugging
514
+ should_save_results = False
515
+ save_username = None
516
+
517
+ if IS_HF_SPACES:
518
+ if user_info.is_logged_in and user_info.username:
519
+ should_save_results = True
520
+ save_username = user_info.username
521
+ else:
522
+ # Local mode - always save for debugging
523
+ from mosaic.ui.user_tabs import LOCAL_DEBUG_USERNAME
524
+
525
+ should_save_results = True
526
+ save_username = LOCAL_DEBUG_USERNAME
527
+
528
+ if should_save_results and save_username:
529
+ try:
530
+ from mosaic.user_results import (
531
+ save_analysis_results as save_user_results,
532
+ )
533
+
534
+ # Prepare settings dict for metadata
535
+ settings_dict = {
536
+ "seg_config": row["Segmentation Config"],
537
+ "site_type": row["Site Type"],
538
+ "sex": row["Sex"],
539
+ "tissue_site": row.get("Tissue Site", "Unknown"),
540
+ "cancer_subtype": row["Cancer Subtype"],
541
+ # "ihc_subtype": row.get("IHC Subtype", ""), # Not yet in use
542
+ }
543
+
544
+ # Use analysis_id + slide index for unique ID per slide
545
+ slide_analysis_id = f"{analysis_id}_{idx}"
546
+
547
+ # Get slide_id from tracking (if available)
548
+ file_slide_id = ""
549
+ if slide_ids and slide_name in slide_ids:
550
+ file_slide_id = slide_ids[slide_name]
551
+
552
+ save_user_results(
553
+ username=save_username,
554
+ analysis_id=slide_analysis_id,
555
+ slide_id=file_slide_id,
556
+ slide_name=slide_name,
557
+ settings=settings_dict,
558
+ aeon_results=aeon_results,
559
+ paladin_results=paladin_results,
560
+ slide_mask=slide_mask,
561
+ )
562
+ logger.debug(
563
+ f"Saved results for user {save_username}, slide {slide_name}"
564
+ )
565
+ except Exception as e:
566
+ logger.warning(f"Failed to save user results (non-fatal): {e}")
567
+
568
  if slide_mask is not None:
569
  mask_filename = f"{Path(slide_name).stem}_mask.png"
570
  saved = _save_mask_as_png(slide_mask, mask_filename)
 
737
 
738
  with gr.Blocks(title="Mosaic") as demo:
739
  user_dir_state = gr.State(None)
740
+ slide_ids_state = gr.State(
741
+ {}
742
+ ) # Maps slide filename to slide_id for user storage
743
 
744
  # Add login button for OAuth (only active on HF Spaces)
745
  gr.LoginButton()
 
940
  file_count="multiple",
941
  )
942
 
943
+ # Storage usage indicator (shown for logged-in users or local mode)
944
+ storage_usage_warning = gr.Markdown(visible=False)
945
+
946
+ # Analyze Existing File option (shown for logged-in users or local mode)
947
+ with gr.Row(visible=False) as existing_file_row:
948
+ gr.Markdown("**Or select an existing slide:**")
949
+ existing_slides_dropdown = gr.Dropdown(
950
+ label="Previously Uploaded Slides",
951
+ choices=[],
952
+ value=None,
953
+ interactive=True,
954
+ multiselect=True,
955
+ info="Select one or more slides you've already uploaded to re-analyze",
956
+ )
957
+ refresh_existing_btn = gr.Button(
958
+ "🔄 Refresh List", size="sm", variant="secondary"
959
+ )
960
+
961
  # TCGA fetch components (hidden by default)
962
  with gr.Group(visible=False) as tcga_input_group:
963
  tcga_id_input = gr.Textbox(
 
1081
  )
1082
 
1083
  def get_settings(
1084
+ files, site_type, sex, tissue_site, cancer_subtype, seg_config
1085
  ):
1086
  """Generate initial settings DataFrame from uploaded files and dropdown values."""
1087
  if files is None:
 
1097
  sex if sex is not None else "",
1098
  tissue_site,
1099
  cancer_subtype,
 
1100
  seg_config,
1101
  ]
1102
  )
 
1140
  gr.File(visible=False),
1141
  )
1142
 
1143
+ # Load storage usage and show warnings
1144
+ def load_storage_usage(request: gr.Request = None):
1145
+ """Load and display storage usage with warnings."""
1146
+ from mosaic.ui.user_tabs import _get_username
1147
+
1148
+ username, is_local = _get_username(request)
1149
+
1150
+ if not username:
1151
+ # Not logged in on HF Spaces - hide storage info
1152
+ return gr.Markdown(visible=False), gr.Row(visible=False)
1153
+
1154
+ try:
1155
+ from mosaic.user_storage import get_storage_usage
1156
+
1157
+ usage = get_storage_usage(username)
1158
+ quota_gb = usage.quota_bytes / (1024**3)
1159
+ used_gb = usage.total_bytes / (1024**3)
1160
+ percent = (
1161
+ (usage.total_bytes / usage.quota_bytes * 100)
1162
+ if usage.quota_bytes > 0
1163
+ else 0
1164
+ )
1165
+
1166
+ mode_indicator = " 🔧 [Local Debug]" if is_local else ""
1167
+
1168
+ # Build warning message
1169
+ if percent >= 100:
1170
+ warning_md = (
1171
+ f"⛔ **Storage quota exceeded!** {used_gb:.2f} GB / {quota_gb:.0f} GB "
1172
+ f"({percent:.0f}%){mode_indicator}\n\n"
1173
+ f"Delete old files from the **My Files** tab before uploading."
1174
+ )
1175
+ warning_color = "red"
1176
+ elif percent >= 80:
1177
+ warning_md = (
1178
+ f"⚠️ **Storage nearly full:** {used_gb:.2f} GB / {quota_gb:.0f} GB "
1179
+ f"({percent:.0f}%){mode_indicator}\n\n"
1180
+ f"Consider deleting old files from the **My Files** tab."
1181
+ )
1182
+ warning_color = "orange"
1183
+ else:
1184
+ warning_md = (
1185
+ f"💾 **Storage:** {used_gb:.2f} GB / {quota_gb:.0f} GB "
1186
+ f"({percent:.0f}%) | {usage.file_count} files{mode_indicator}"
1187
+ )
1188
+ warning_color = "gray"
1189
+
1190
+ return gr.Markdown(value=warning_md, visible=True), gr.Row(visible=True)
1191
+
1192
+ except Exception as e:
1193
+ logger.error(f"Failed to load storage usage: {e}")
1194
+ return gr.Markdown(visible=False), gr.Row(visible=False)
1195
+
1196
+ # Load list of existing slides for re-analysis
1197
+ def load_existing_slides(request: gr.Request = None):
1198
+ """Load dropdown with user's existing slides."""
1199
+ from mosaic.ui.user_tabs import _get_username
1200
+
1201
+ username, _ = _get_username(request)
1202
+
1203
+ if not username:
1204
+ return gr.Dropdown(choices=[], value=None)
1205
+
1206
+ try:
1207
+ from mosaic.user_storage import list_user_slides
1208
+
1209
+ slides = list_user_slides(username)
1210
+
1211
+ if not slides:
1212
+ return gr.Dropdown(
1213
+ choices=[], value=None, info="No slides uploaded yet"
1214
+ )
1215
+
1216
+ # Create dropdown choices: display name → slide_id
1217
+ choices = [
1218
+ (
1219
+ f"{slide.original_filename} ({slide.size_bytes / (1024**2):.1f} MB)",
1220
+ slide.slide_id,
1221
+ )
1222
+ for slide in slides
1223
+ ]
1224
+
1225
+ return gr.Dropdown(
1226
+ choices=choices,
1227
+ value=None,
1228
+ info=f"{len(slides)} slide(s) available for re-analysis",
1229
+ )
1230
+
1231
+ except Exception as e:
1232
+ logger.error(f"Failed to load existing slides: {e}")
1233
+ return gr.Dropdown(choices=[], value=None)
1234
+
1235
+ # Handle slide upload and save to user storage
1236
+ def handle_slide_upload(files, slide_ids, request: gr.Request = None):
1237
+ """Save uploaded slides to user storage and track slide IDs."""
1238
+ if not files:
1239
+ return files, slide_ids or {}
1240
+
1241
+ # Determine username
1242
+ if IS_HF_SPACES:
1243
+ if not request or not request.username:
1244
+ # Not logged in on HF Spaces - don't save
1245
+ return files, slide_ids or {}
1246
+ username = request.username
1247
+ else:
1248
+ # Local mode - always use local_user for debugging
1249
+ from mosaic.ui.user_tabs import LOCAL_DEBUG_USERNAME
1250
+
1251
+ username = LOCAL_DEBUG_USERNAME
1252
+
1253
+ try:
1254
+ from mosaic.user_storage import save_uploaded_slide
1255
+
1256
+ new_slide_ids = slide_ids.copy() if slide_ids else {}
1257
+
1258
+ for file in files:
1259
+ try:
1260
+ # Save slide to user storage
1261
+ slide_id, _ = save_uploaded_slide(username, file)
1262
+
1263
+ # Track mapping from filename to slide_id
1264
+ filename = (
1265
+ Path(file.name).name if hasattr(file, "name") else str(file)
1266
+ )
1267
+ new_slide_ids[filename] = slide_id
1268
+
1269
+ logger.info(
1270
+ f"Saved slide {filename} with ID {slide_id} for user {username}"
1271
+ )
1272
+ except Exception as e:
1273
+ # Best effort - don't fail upload if storage fails
1274
+ logger.warning(
1275
+ f"Failed to save slide {file} to user storage: {e}"
1276
+ )
1277
+
1278
+ return files, new_slide_ids
1279
+
1280
+ except Exception as e:
1281
+ logger.error(f"Error in slide upload handler: {e}")
1282
+ # Don't fail the upload, just log the error
1283
+ return files, slide_ids or {}
1284
+
1285
  # Handle TCGA slide fetch
1286
  def fetch_tcga_slide(tcga_id, seg_config, progress=gr.Progress()):
1287
  """Fetch a slide from TCGA and pre-fill metadata."""
 
1411
  )
1412
 
1413
  # Handle file uploads - regenerate entire settings table
1414
+ # Save uploaded slides to user storage (logged-in users on HF Spaces; 'local_user' when running locally)
1415
+ input_slides.upload(
1416
+ handle_slide_upload,
1417
+ inputs=[input_slides, slide_ids_state],
1418
+ outputs=[input_slides, slide_ids_state],
1419
+ ).then(
1420
+ # Refresh storage usage after upload
1421
+ load_storage_usage,
1422
+ inputs=None,
1423
+ outputs=[storage_usage_warning, existing_file_row],
1424
+ ).then(
1425
+ # Refresh existing slides list after upload
1426
+ load_existing_slides,
1427
+ inputs=None,
1428
+ outputs=[existing_slides_dropdown],
1429
+ )
1430
+
1431
+ # Refresh existing slides dropdown
1432
+ refresh_existing_btn.click(
1433
+ load_existing_slides,
1434
+ inputs=None,
1435
+ outputs=[existing_slides_dropdown],
1436
+ )
1437
+
1438
+ # Handle selection of existing slide(s) for re-analysis
1439
+ def select_existing_slide(slide_ids, request: gr.Request = None):
1440
+ """When user selects existing slide(s), load them for analysis.
1441
+
1442
+ Args:
1443
+ slide_ids: Single slide_id (str) or list of slide_ids (multiselect)
1444
+ request: Gradio request object
1445
+
1446
+ Returns:
1447
+ gr.File update with selected slide(s)
1448
+ """
1449
+ if not slide_ids:
1450
+ return None
1451
+
1452
+ # Handle both single selection and multiselect
1453
+ if isinstance(slide_ids, str):
1454
+ slide_ids = [slide_ids]
1455
+
1456
+ from mosaic.ui.user_tabs import _get_username
1457
+
1458
+ username, _ = _get_username(request)
1459
+ if not username:
1460
+ return None
1461
+
1462
+ try:
1463
+ from mosaic.user_storage import get_slide_path, _load_slide_metadata
1464
+
1465
+ metadata = _load_slide_metadata(username)
1466
+ slide_paths = []
1467
+
1468
+ for slide_id in slide_ids:
1469
+ slide_path = get_slide_path(username, slide_id)
1470
+ if slide_path is None:
1471
+ logger.warning(
1472
+ f"Slide {slide_id} not found for user {username}"
1473
+ )
1474
+ continue
1475
+
1476
+ # Create a symlink with the original filename so the File
1477
+ # component displays the real name instead of the UUID storage name
1478
+ if slide_id in metadata:
1479
+ original_name = metadata[slide_id].original_filename
1480
+ if slide_path.name != original_name:
1481
+ tmp_dir = Path(tempfile.mkdtemp(prefix="mosaic_slide_"))
1482
+ symlink_path = tmp_dir / original_name
1483
+ symlink_path.symlink_to(slide_path)
1484
+ slide_paths.append(str(symlink_path))
1485
+ else:
1486
+ slide_paths.append(str(slide_path))
1487
+ else:
1488
+ slide_paths.append(str(slide_path))
1489
+
1490
+ if not slide_paths:
1491
+ return None
1492
+
1493
+ return gr.File(value=slide_paths)
1494
+
1495
+ except Exception as e:
1496
+ logger.error(f"Failed to load existing slides {slide_ids}: {e}")
1497
+ return None
1498
+
1499
+ existing_slides_dropdown.change(
1500
+ select_existing_slide,
1501
+ inputs=[existing_slides_dropdown],
1502
+ outputs=[input_slides],
1503
+ )
1504
+
1505
  @input_slides.change(
1506
  inputs=[
1507
  input_slides,
 
1527
  sex,
1528
  tissue_site,
1529
  cancer_subtype,
 
1530
  seg_config,
1531
  )
1532
  # if settings_df is not None:
 
1636
  ihc_subtype,
1637
  seg_config,
1638
  user_dir,
1639
+ slide_ids,
1640
  progress=gr.Progress(track_tqdm=True),
1641
  request: gr.Request = None,
1642
  profile: Optional[gr.OAuthProfile] = None,
 
1682
  ihc_subtype,
1683
  seg_config,
1684
  user_dir,
1685
+ slide_ids=slide_ids,
1686
  progress=progress,
1687
  request=request,
1688
  profile=profile,
 
1702
  ihc_subtype_dropdown,
1703
  seg_config_dropdown,
1704
  user_dir_state,
1705
+ slide_ids_state,
1706
  ],
1707
  outputs=[
1708
  settings_input,
 
1769
  outputs=[user_dir_state],
1770
  )
1771
 
1772
+ # Load storage usage on page load
1773
+ demo.load(
1774
+ load_storage_usage,
1775
+ inputs=None,
1776
+ outputs=[storage_usage_warning, existing_file_row],
1777
+ )
1778
+
1779
+ # Load existing slides dropdown on page load
1780
+ demo.load(
1781
+ load_existing_slides,
1782
+ inputs=None,
1783
+ outputs=[existing_slides_dropdown],
1784
+ )
1785
+
1786
+ # Add My Files and My Results tabs for user storage
1787
+ # On HF Spaces: visible for logged-in users
1788
+ # Locally: always visible with "local_user" for debugging
1789
+ gr.Markdown("---")
1790
+ if IS_HF_SPACES:
1791
+ gr.Markdown("## User Storage (Logged-In Users Only)")
1792
+ gr.Markdown(
1793
+ "The tabs below are only accessible to logged-in users. "
1794
+ "Upload and view your slides and analysis results."
1795
+ )
1796
+ else:
1797
+ gr.Markdown("## User Storage (Local Debug Mode)")
1798
+ gr.Markdown(
1799
+ "🔧 **Debug Mode**: Files stored as user 'local_user' in `/tmp/mosaic_user_data/local_user/`"
1800
+ )
1801
+
1802
+ with gr.Tabs():
1803
+ with gr.Tab("My Files"):
1804
+ from mosaic.ui.user_tabs import create_my_files_tab
1805
+
1806
+ my_files_components = create_my_files_tab()
1807
+
1808
+ # Load files on demo load
1809
+ demo.load(
1810
+ my_files_components["load_files"],
1811
+ inputs=None,
1812
+ outputs=[
1813
+ my_files_components["storage_usage"],
1814
+ my_files_components["files_table"],
1815
+ my_files_components["file_action_status"],
1816
+ my_files_components["slide_download_file"],
1817
+ ],
1818
+ )
1819
+
1820
+ with gr.Tab("My Results"):
1821
+ from mosaic.ui.user_tabs import create_my_results_tab
1822
+
1823
+ my_results_components = create_my_results_tab()
1824
+
1825
+ # Load results on demo load
1826
+ demo.load(
1827
+ my_results_components["load_results"],
1828
+ inputs=None,
1829
+ outputs=[
1830
+ my_results_components["results_table"],
1831
+ my_results_components["result_action_status"],
1832
+ my_results_components["result_download_file"],
1833
+ ],
1834
+ )
1835
+
1836
  # Use hardware-specific concurrency limit
1837
  # T4 GPUs (16GB) can only handle one analysis at a time to prevent OOM
1838
  # Higher-memory GPUs and ZeroGPU can handle multiple concurrent analyses
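
The symlink step in `select_existing_slide` can be looked at in isolation. The sketch below is a simplified version under stated assumptions: `link_with_original_name` is a hypothetical helper, and stripping directory components from the original filename is an added precaution that the diff itself does not include.

```python
import tempfile
from pathlib import Path


def link_with_original_name(stored_path: Path, original_name: str) -> Path:
    """Return a path that shows the original filename but points at the stored file."""
    # Assumed hardening: keep only the final path component of the original name.
    safe_name = Path(original_name).name
    if stored_path.name == safe_name:
        return stored_path
    # A fresh temporary directory avoids name clashes between selections.
    tmp_dir = Path(tempfile.mkdtemp(prefix="mosaic_slide_"))
    link_path = tmp_dir / safe_name
    link_path.symlink_to(stored_path)
    return link_path


# Demo with a throwaway file standing in for a stored slide.
storage_dir = Path(tempfile.mkdtemp(prefix="mosaic_storage_"))
stored = storage_dir / "3f2a9c.svs"
stored.write_bytes(b"not a real slide")
display_path = link_with_original_name(stored, "biopsy_case_12.svs")
print(display_path.name)
```

Each call creates a new temporary directory that is never removed, which is acceptable for ephemeral storage but would need cleanup on a long-lived host.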
src/mosaic/ui/user_tabs.py ADDED
@@ -0,0 +1,600 @@
1
+ """My Files and My Results tabs for HuggingFace Spaces user storage.
2
+
3
+ This module provides UI components for browsing and managing user-uploaded
4
+ slides and analysis results in ephemeral storage.
5
+ """
6
+
7
+ import gradio as gr
8
+ import pandas as pd
9
+ from loguru import logger
10
+ from pathlib import Path
11
+
12
+ from mosaic.hardware import IS_HF_SPACES
13
+ from mosaic.user_storage import (
14
+ list_user_slides,
15
+ get_storage_usage,
16
+ delete_user_slide,
17
+ get_slide_path,
18
+ find_slide_by_name,
19
+ DEFAULT_QUOTA_BYTES,
20
+ )
21
+ from mosaic.user_results import (
22
+ list_user_results,
23
+ load_analysis_results,
24
+ delete_analysis_results,
25
+ create_results_zip,
26
+ )
27
+
28
+ # Local debug username for non-HF Spaces environments
29
+ LOCAL_DEBUG_USERNAME = "local_user"
30
+
31
+
32
+ def _get_username(request: gr.Request) -> tuple[str, bool]:
33
+ """Get username for storage operations.
34
+
35
+ Returns:
36
+ Tuple of (username, is_local_mode)
37
+ - In HF Spaces: (request.username, False) if logged in
38
+ - Locally: (LOCAL_DEBUG_USERNAME, True)
39
+ - HF Spaces not logged in: (None, False)
40
+ """
41
+ if IS_HF_SPACES:
42
+ # HF Spaces - use OAuth username if available
43
+ if request and request.username:
44
+ return (request.username, False)
45
+ return (None, False)
46
+ else:
47
+ # Local mode - use debug username
48
+ return (LOCAL_DEBUG_USERNAME, True)
49
+
50
+
51
+ def create_my_files_tab():
52
+ """Create the My Files tab UI for managing uploaded slides.
53
+
54
+ Returns:
55
+ Dictionary of UI components for event handlers
56
+ """
57
+ with gr.Column():
58
+ gr.Markdown("### My Uploaded Files")
59
+ gr.Markdown(
60
+ "View and manage your uploaded slides. Files are stored temporarily "
61
+ "and cleared when the instance restarts."
62
+ )
63
+
64
+ # Storage usage indicator
65
+ storage_usage_md = gr.Markdown("**Storage:** Loading...")
66
+
67
+ # Files table
68
+ files_table = gr.Dataframe(
69
+ headers=["Slide ID", "Filename", "Size (MB)", "Upload Date", "# Analyses"],
70
+ datatype=["str", "str", "number", "str", "number"],
71
+ label="Uploaded Slides",
72
+ interactive=False,
73
+ column_widths=["3fr", "3fr", "1fr", "2fr", "1fr"],
74
+ )
75
+
76
+ # Action buttons
77
+ with gr.Row():
78
+ refresh_files_btn = gr.Button("Refresh", variant="secondary")
79
+ selected_slide_id = gr.Textbox(
80
+ label="Selected Slide ID",
81
+ placeholder="Click a row in the table to select a slide",
82
+ )
83
+
84
+ with gr.Row():
85
+ download_slide_btn = gr.Button("Download Original", variant="secondary")
86
+ delete_slide_btn = gr.Button("Delete Slide", variant="stop")
87
+
88
+ # Status messages
89
+ file_action_status = gr.Textbox(
90
+ label="Status",
91
+ interactive=False,
92
+ visible=False,
93
+ )
94
+
95
+ # Download output
96
+ slide_download_file = gr.File(label="Download", visible=False)
97
+
98
+ def load_files(request: gr.Request):
99
+ """Load user's uploaded files."""
100
+ username, is_local = _get_username(request)
101
+
102
+ if not username:
103
+ return (
104
+ "**Storage:** Not logged in (login required on HF Spaces)",
105
+ [],
106
+ gr.Textbox(visible=False),
107
+ gr.File(visible=False),
108
+ )
109
+
110
+ try:
111
+ # Add debug indicator for local mode
112
+ mode_indicator = " 🔧 **[Local Debug Mode]**" if is_local else ""
113
+
114
+ # Get storage usage
115
+ usage = get_storage_usage(username)
116
+ quota_gb = usage.quota_bytes / (1024**3)
117
+ used_gb = usage.total_bytes / (1024**3)
118
+ percent = (
119
+ (usage.total_bytes / usage.quota_bytes * 100)
120
+ if usage.quota_bytes > 0
121
+ else 0
122
+ )
123
+
124
+ storage_md = (
125
+ f"**Storage:** {used_gb:.2f} GB / {quota_gb:.0f} GB ({percent:.1f}%) | "
126
+ f"{usage.file_count} files{mode_indicator}"
127
+ )
128
+
129
+ if percent >= 80:
130
+ storage_md += " ⚠️ **Nearly full!**"
131
+
132
+ # Get files list
133
+ slides = list_user_slides(username)
134
+
135
+ if not slides:
136
+ return (
137
+ storage_md,
138
+ [],
139
+ gr.Textbox(visible=False),
140
+ gr.File(visible=False),
141
+ )
142
+
143
+ # Get result counts
144
+ from mosaic.user_results import list_user_results as list_results
145
+
146
+ files_data = []
147
+ for slide in slides:
148
+ # Count results for this slide
149
+ results = list_results(username, slide_id=slide.slide_id)
150
+
151
+ files_data.append(
152
+ [
153
+ Path(slide.original_filename).stem,
154
+ slide.original_filename,
155
+ round(slide.size_bytes / (1024**2), 2), # Convert to MB
156
+ slide.upload_time[:19], # Truncate to datetime
157
+ len(results),
158
+ ]
159
+ )
160
+
161
+ return (
162
+ storage_md,
163
+ files_data,
164
+ gr.Textbox(visible=False),
165
+ gr.File(visible=False),
166
+ )
167
+
168
+ except Exception as e:
169
+ logger.error(f"Failed to load user files: {e}")
170
+ return (
171
+ "**Storage:** Error loading files",
172
+ [],
173
+ gr.Textbox(value=f"Error: {str(e)}", visible=True),
174
+ gr.File(visible=False),
175
+ )
176
+
177
+ def download_slide(slide_id_or_name, request: gr.Request):
178
+ """Download the original slide file."""
179
+ username, _ = _get_username(request)
180
+
181
+ if not username or not slide_id_or_name:
182
+ return gr.File(visible=False), gr.Textbox(
183
+ value="Not logged in or no slide selected", visible=True
184
+ )
185
+
186
+ try:
187
+ # Resolve name/stem to UUID if needed
188
+ resolved_id = find_slide_by_name(username, slide_id_or_name)
189
+ if resolved_id is None:
190
+ return gr.File(visible=False), gr.Textbox(
191
+ value="Slide not found", visible=True
192
+ )
193
+
194
+ slide_path = get_slide_path(username, resolved_id)
195
+
196
+ if slide_path is None:
197
+ return gr.File(visible=False), gr.Textbox(
198
+ value="Slide not found", visible=True
199
+ )
200
+
201
+ return gr.File(value=str(slide_path), visible=True), gr.Textbox(
202
+ value="Download ready", visible=True
203
+ )
204
+
205
+ except Exception as e:
206
+ logger.error(f"Failed to download slide {slide_id_or_name}: {e}")
207
+ return gr.File(visible=False), gr.Textbox(
208
+ value=f"Error: {str(e)}", visible=True
209
+ )
210
+
211
+ def delete_slide(slide_id_or_name, request: gr.Request):
212
+ """Delete a slide and all associated results."""
213
+ username, _ = _get_username(request)
214
+
215
+ if not username or not slide_id_or_name:
216
+ return gr.Textbox(
217
+ value="Not logged in or no slide selected", visible=True
218
+ )
219
+
220
+ try:
221
+ # Resolve name/stem to UUID if needed
222
+ resolved_id = find_slide_by_name(username, slide_id_or_name)
223
+ if resolved_id is None:
224
+ return gr.Textbox(value="Slide not found", visible=True)
225
+
226
+ success = delete_user_slide(username, resolved_id)
227
+
228
+ if success:
229
+ return gr.Textbox(
230
+ value=f"Deleted slide '{slide_id_or_name}' and all associated results",
231
+ visible=True,
232
+ )
233
+ else:
234
+ return gr.Textbox(value="Slide not found", visible=True)
235
+
236
+ except Exception as e:
237
+ logger.error(f"Failed to delete slide {slide_id_or_name}: {e}")
238
+ return gr.Textbox(value=f"Error: {str(e)}", visible=True)
239
+
240
+ # Wire up events
241
+ refresh_files_btn.click(
242
+ load_files,
243
+ inputs=None,
244
+ outputs=[
245
+ storage_usage_md,
246
+ files_table,
247
+ file_action_status,
248
+ slide_download_file,
249
+ ],
250
+ )
251
+
252
+ download_slide_btn.click(
253
+ download_slide,
254
+ inputs=[selected_slide_id],
255
+ outputs=[slide_download_file, file_action_status],
256
+ )
257
+
258
+ delete_slide_btn.click(
259
+ delete_slide,
260
+ inputs=[selected_slide_id],
261
+ outputs=[file_action_status],
262
+ ).then(
263
+ load_files, # Refresh after delete
264
+ inputs=None,
265
+ outputs=[
266
+ storage_usage_md,
267
+ files_table,
268
+ file_action_status,
269
+ slide_download_file,
270
+ ],
271
+ )
272
+
273
+ def on_file_row_select(evt: gr.SelectData, table_data):
274
+ """Auto-fill slide ID when a row is clicked."""
275
+ if table_data is not None and len(table_data) > 0:
276
+ row = evt.index[0]
277
+ return str(table_data.iloc[row, 0]) # Slide ID column
278
+ return ""
279
+
280
+ files_table.select(
281
+ on_file_row_select,
282
+ inputs=[files_table],
283
+ outputs=[selected_slide_id],
284
+ )
285
+
286
+ return {
287
+ "storage_usage": storage_usage_md,
288
+ "files_table": files_table,
289
+ "refresh_btn": refresh_files_btn,
290
+ "load_files": load_files,
291
+ "file_action_status": file_action_status,
292
+ "slide_download_file": slide_download_file,
293
+ }
294
+
295
+
296
+ def create_my_results_tab():
297
+ """Create the My Results tab UI for browsing analysis results.
298
+
299
+ Returns:
300
+ Dictionary of UI components for event handlers
301
+ """
302
+ with gr.Column():
303
+ gr.Markdown("### My Analysis Results")
304
+ gr.Markdown(
305
+ "View and download results from previous analyses. "
306
+ "Results are stored temporarily and cleared when the instance restarts."
307
+ )
308
+
309
+ # Results table
310
+ results_table = gr.Dataframe(
311
+ headers=["Analysis ID", "Slide Name", "Date", "Cancer Subtype", "Settings"],
312
+ datatype=["str", "str", "str", "str", "str"],
313
+ label="Analysis Results",
314
+ interactive=False,
315
+ )
316
+
317
+ # Action buttons
318
+ with gr.Row():
319
+ refresh_results_btn = gr.Button("Refresh", variant="secondary")
320
+ selected_analysis_id = gr.Textbox(
321
+ label="Selected Analysis ID",
322
+ placeholder="Click a row in the table to select a result",
323
+ )
324
+
325
+ with gr.Row():
326
+ view_result_btn = gr.Button("View Details", variant="secondary")
327
+ download_zip_btn = gr.Button("Download ZIP", variant="primary")
328
+ delete_result_btn = gr.Button("Delete Result", variant="stop")
329
+
330
+ # Result detail section (hidden by default)
331
+ with gr.Accordion("Result Details", open=False) as result_details:
332
+ result_metadata_md = gr.Markdown("No result selected")
333
+ result_mask_img = gr.Image(label="Tissue Segmentation Mask", visible=False)
334
+ result_aeon_df = gr.Dataframe(
335
+ label="Cancer Subtype Predictions", visible=False
336
+ )
337
+ result_paladin_df = gr.Dataframe(
338
+ label="Biomarker Predictions", visible=False
339
+ )
340
+
341
+ # Status and download
342
+ result_action_status = gr.Textbox(
343
+ label="Status", interactive=False, visible=False
344
+ )
345
+ result_download_file = gr.File(label="Download", visible=False)
346
+
347
+ def load_results(request: gr.Request):
348
+ """Load user's analysis results."""
349
+ username, _ = _get_username(request)
350
+
351
+ if not username:
352
+ return [], gr.Textbox(visible=False), gr.File(visible=False)
353
+
354
+ try:
355
+ results = list_user_results(username)
356
+
357
+ if not results:
358
+ return [], gr.Textbox(visible=False), gr.File(visible=False)
359
+
360
+ results_data = []
361
+ for result in results:
362
+ # Format settings as short string
363
+ settings_str = f"{result.settings.get('site_type', 'N/A')} | {result.settings.get('sex', 'N/A')}"
364
+
365
+ results_data.append(
366
+ [
367
+ result.analysis_id,
368
+ result.slide_name,
369
+ result.timestamp[:19], # Truncate to datetime
370
+ result.cancer_subtype or "Unknown",
371
+ settings_str,
372
+ ]
373
+ )
374
+
375
+ return results_data, gr.Textbox(visible=False), gr.File(visible=False)
376
+
377
+ except Exception as e:
378
+ logger.error(f"Failed to load user results: {e}")
379
+ return (
380
+ [],
381
+ gr.Textbox(value=f"Error: {str(e)}", visible=True),
382
+ gr.File(visible=False),
383
+ )
384
+
385
+ def view_result_details(analysis_id, request: gr.Request):
386
+ """Load and display result details."""
387
+ username, _ = _get_username(request)
388
+
389
+ if not username or not analysis_id:
390
+ return (
391
+ gr.Accordion(open=False),
392
+ "No result selected",
393
+ gr.Image(visible=False),
394
+ gr.Dataframe(visible=False),
395
+ gr.Dataframe(visible=False),
396
+ gr.Textbox(visible=False),
397
+ )
398
+
399
+ try:
400
+ loaded = load_analysis_results(username, analysis_id)
401
+
402
+ if loaded is None:
403
+ return (
404
+ gr.Accordion(open=False),
405
+ "Result not found",
406
+ gr.Image(visible=False),
407
+ gr.Dataframe(visible=False),
408
+ gr.Dataframe(visible=False),
409
+ gr.Textbox(value="Result not found", visible=True),
410
+ )
411
+
412
+ metadata, mask, aeon_df, paladin_df = loaded
413
+
414
+ # Format aeon results with human-readable cancer subtype names
415
+ if aeon_df is not None and len(aeon_df) > 0:
416
+ from mosaic.ui.utils import get_oncotree_code_name
417
+
418
+ aeon_df = aeon_df.reset_index()
419
+ # Rename index column to "Cancer Subtype" if needed
420
+ if aeon_df.columns[0] != "Cancer Subtype":
421
+ aeon_df = aeon_df.rename(
422
+ columns={aeon_df.columns[0]: "Cancer Subtype"}
423
+ )
424
+ # Add human-readable names
425
+ aeon_df["Cancer Subtype"] = [
426
+ f"{get_oncotree_code_name(code)} ({code})"
427
+ for code in aeon_df["Cancer Subtype"]
428
+ ]
429
+ # Round confidence scores
430
+ for col in aeon_df.columns[1:]:
431
+ aeon_df[col] = pd.to_numeric(
432
+ aeon_df[col], errors="coerce"
433
+ ).round(3)
434
+
435
+ # Round paladin scores
436
+ if paladin_df is not None and "Score" in paladin_df.columns:
437
+ paladin_df["Score"] = pd.to_numeric(
438
+ paladin_df["Score"], errors="coerce"
439
+ ).round(3)
440
+
441
+ # Format metadata
442
+ settings_md = "### Analysis Settings\n\n"
443
+ for key, value in metadata.settings.items():
444
+ settings_md += f"- **{key}**: {value}\n"
445
+
446
+ # Format cancer subtype with readable name
447
+ subtype_display = metadata.cancer_subtype or "Unknown"
448
+ if subtype_display and subtype_display != "Unknown":
449
+ from mosaic.ui.utils import get_oncotree_code_name
450
+
451
+ name = get_oncotree_code_name(subtype_display)
452
+ if name and name != subtype_display:
453
+ subtype_display = f"{name} ({subtype_display})"
454
+
455
+ metadata_text = (
456
+ f"### {metadata.slide_name}\n\n"
457
+ f"**Analysis ID:** {metadata.analysis_id}\n\n"
458
+ f"**Date:** {metadata.timestamp[:19]}\n\n"
459
+ f"**Predicted Subtype:** {subtype_display}\n\n"
460
+ f"{settings_md}"
461
+ )
462
+
463
+ return (
464
+ gr.Accordion(open=True),
465
+ metadata_text,
466
+ (
467
+ gr.Image(value=mask, visible=True)
468
+ if mask is not None
469
+ else gr.Image(visible=False)
470
+ ),
471
+ (
472
+ gr.Dataframe(value=aeon_df, visible=True)
473
+ if aeon_df is not None
474
+ else gr.Dataframe(visible=False)
475
+ ),
476
+ (
477
+ gr.Dataframe(value=paladin_df, visible=True)
478
+ if paladin_df is not None
479
+ else gr.Dataframe(visible=False)
480
+ ),
481
+ gr.Textbox(visible=False),
482
+ )
483
+
484
+ except Exception as e:
485
+ logger.error(f"Failed to load result details {analysis_id}: {e}")
486
+ return (
487
+ gr.Accordion(open=False),
488
+ "Error loading result",
489
+ gr.Image(visible=False),
490
+ gr.Dataframe(visible=False),
491
+ gr.Dataframe(visible=False),
492
+ gr.Textbox(value=f"Error: {str(e)}", visible=True),
493
+ )
494
+
495
+ def download_result_zip(analysis_id, request: gr.Request):
496
+ """Create and download ZIP of result files."""
497
+ username, _ = _get_username(request)
498
+
499
+ if not username or not analysis_id:
500
+ return gr.File(visible=False), gr.Textbox(
501
+ value="Not logged in or no result selected", visible=True
502
+ )
503
+
504
+ try:
505
+ zip_path = create_results_zip(username, analysis_id)
506
+
507
+ if zip_path is None:
508
+ return gr.File(visible=False), gr.Textbox(
509
+ value="Result not found", visible=True
510
+ )
511
+
512
+ return gr.File(value=str(zip_path), visible=True), gr.Textbox(
513
+ value="ZIP ready for download", visible=True
514
+ )
515
+
516
+ except Exception as e:
517
+ logger.error(f"Failed to create ZIP for {analysis_id}: {e}")
518
+ return gr.File(visible=False), gr.Textbox(
519
+ value=f"Error: {str(e)}", visible=True
520
+ )
521
+
522
+ def delete_result(analysis_id, request: gr.Request):
523
+ """Delete an analysis result."""
524
+ username, _ = _get_username(request)
525
+
526
+ if not username or not analysis_id:
527
+ return gr.Textbox(
528
+ value="Not logged in or no result selected", visible=True
529
+ )
530
+
531
+ try:
532
+ success = delete_analysis_results(username, analysis_id)
533
+
534
+ if success:
535
+ return gr.Textbox(
536
+ value=f"Deleted result {analysis_id}", visible=True
537
+ )
538
+ else:
539
+ return gr.Textbox(value="Result not found", visible=True)
540
+
541
+ except Exception as e:
542
+ logger.error(f"Failed to delete result {analysis_id}: {e}")
543
+ return gr.Textbox(value=f"Error: {str(e)}", visible=True)
544
+
545
+ # Wire up events
546
+ refresh_results_btn.click(
547
+ load_results,
548
+ inputs=None,
549
+ outputs=[results_table, result_action_status, result_download_file],
550
+ )
551
+
552
+ view_result_btn.click(
553
+ view_result_details,
554
+ inputs=[selected_analysis_id],
555
+ outputs=[
556
+ result_details,
557
+ result_metadata_md,
558
+ result_mask_img,
559
+ result_aeon_df,
560
+ result_paladin_df,
561
+ result_action_status,
562
+ ],
563
+ )
564
+
565
+ download_zip_btn.click(
566
+ download_result_zip,
567
+ inputs=[selected_analysis_id],
568
+ outputs=[result_download_file, result_action_status],
569
+ )
570
+
571
+ delete_result_btn.click(
572
+ delete_result,
573
+ inputs=[selected_analysis_id],
574
+ outputs=[result_action_status],
575
+ ).then(
576
+ load_results, # Refresh after delete
577
+ inputs=None,
578
+ outputs=[results_table, result_action_status, result_download_file],
579
+ )
580
+
581
+ def on_result_row_select(evt: gr.SelectData, table_data):
582
+ """Auto-fill analysis ID when a row is clicked."""
583
+ if table_data is not None and len(table_data) > 0:
584
+ row = evt.index[0]
585
+ return str(table_data.iloc[row, 0]) # Analysis ID column
586
+ return ""
587
+
588
+ results_table.select(
589
+ on_result_row_select,
590
+ inputs=[results_table],
591
+ outputs=[selected_analysis_id],
592
+ )
593
+
594
+ return {
595
+ "results_table": results_table,
596
+ "refresh_btn": refresh_results_btn,
597
+ "load_results": load_results,
598
+ "result_action_status": result_action_status,
599
+ "result_download_file": result_download_file,
600
+ }
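
The storage banner assembled in `load_files` above can be isolated as a pure function, which makes the GB conversion, the zero-quota guard, and the 80% warning threshold easy to check. This is a minimal sketch, not part of the commit: the helper name `format_storage_md` is hypothetical, and it takes raw byte counts instead of the `StorageInfo` object.

```python
def format_storage_md(
    total_bytes: int,
    quota_bytes: int,
    file_count: int,
    is_local: bool = False,
) -> str:
    """Hypothetical helper mirroring the banner logic in load_files."""
    gib = 1024**3
    used_gb = total_bytes / gib
    quota_gb = quota_bytes / gib
    # Guard against a zero quota, as the original code does.
    percent = (total_bytes / quota_bytes * 100) if quota_bytes > 0 else 0
    mode_indicator = " **[Local Debug Mode]**" if is_local else ""

    text = (
        f"**Storage:** {used_gb:.2f} GB / {quota_gb:.0f} GB ({percent:.1f}%) | "
        f"{file_count} files{mode_indicator}"
    )
    # Same threshold as the diff: warn once usage exceeds 80% of the quota.
    if percent > 80:
        text += " **Nearly full!**"
    return text
```

Keeping the formatting separate from the Gradio callback would let the threshold be unit-tested without a `gr.Request`.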
src/mosaic/ui/utils.py CHANGED
@@ -27,7 +27,7 @@ SETTINGS_COLUMNS = [
27
  "Sex",
28
  "Tissue Site",
29
  "Cancer Subtype",
30
- "IHC Subtype",
31
  "Segmentation Config",
32
  ]
33
 
@@ -154,8 +154,8 @@ def load_settings(slide_csv_path):
154
  settings_df["Segmentation Config"] = "Biopsy"
155
  if "Cancer Subtype" not in settings_df.columns:
156
  settings_df["Cancer Subtype"] = "Unknown"
157
- if "IHC Subtype" not in settings_df.columns:
158
- settings_df["IHC Subtype"] = ""
159
  if "Tissue Site" not in settings_df.columns:
160
  settings_df["Tissue Site"] = "Unknown"
161
  if not set(SETTINGS_COLUMNS).issubset(settings_df.columns):
@@ -225,19 +225,20 @@ def validate_settings(
225
  f"Slide {slide_name}: Unknown tissue site. Valid tissue sites are: {', '.join(tissue_sites)}. "
226
  )
227
  settings_df.at[idx, "Tissue Site"] = "Unknown"
228
- if (
229
- "Breast" not in settings_df.at[idx, "Cancer Subtype"]
230
- and row["IHC Subtype"] != ""
231
- ):
232
- warnings.append(
233
- f"Slide {slide_name}: IHC subtype should be empty for non-breast cancer subtypes. "
234
- )
235
- settings_df.at[idx, "IHC Subtype"] = ""
236
- if row["IHC Subtype"] not in IHC_SUBTYPES:
237
- warnings.append(
238
- f"Slide {slide_name}: Unknown IHC subtype. Valid subtypes are: {', '.join(IHC_SUBTYPES)}. "
239
- )
240
- settings_df.at[idx, "IHC Subtype"] = ""
 
241
  if row["Segmentation Config"] not in ["Biopsy", "Resection", "TCGA"]:
242
  warnings.append(
243
  f"Slide {slide_name}: Unknown segmentation config. Valid configs are: Biopsy, Resection, TCGA. "
 
27
  "Sex",
28
  "Tissue Site",
29
  "Cancer Subtype",
30
+ # "IHC Subtype", # Not yet in use
31
  "Segmentation Config",
32
  ]
33
 
 
154
  settings_df["Segmentation Config"] = "Biopsy"
155
  if "Cancer Subtype" not in settings_df.columns:
156
  settings_df["Cancer Subtype"] = "Unknown"
157
+ # if "IHC Subtype" not in settings_df.columns: # Not yet in use
158
+ # settings_df["IHC Subtype"] = ""
159
  if "Tissue Site" not in settings_df.columns:
160
  settings_df["Tissue Site"] = "Unknown"
161
  if not set(SETTINGS_COLUMNS).issubset(settings_df.columns):
 
225
  f"Slide {slide_name}: Unknown tissue site. Valid tissue sites are: {', '.join(tissue_sites)}. "
226
  )
227
  settings_df.at[idx, "Tissue Site"] = "Unknown"
228
+ # IHC Subtype validation - not yet in use
229
+ # if (
230
+ # "Breast" not in settings_df.at[idx, "Cancer Subtype"]
231
+ # and row["IHC Subtype"] != ""
232
+ # ):
233
+ # warnings.append(
234
+ # f"Slide {slide_name}: IHC subtype should be empty for non-breast cancer subtypes. "
235
+ # )
236
+ # settings_df.at[idx, "IHC Subtype"] = ""
237
+ # if row["IHC Subtype"] not in IHC_SUBTYPES:
238
+ # warnings.append(
239
+ # f"Slide {slide_name}: Unknown IHC subtype. Valid subtypes are: {', '.join(IHC_SUBTYPES)}. "
240
+ # )
241
+ # settings_df.at[idx, "IHC Subtype"] = ""
242
  if row["Segmentation Config"] not in ["Biopsy", "Resection", "TCGA"]:
243
  warnings.append(
244
  f"Slide {slide_name}: Unknown segmentation config. Valid configs are: Biopsy, Resection, TCGA. "
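
The commit message attributes the `ValueError` to a seven-value settings row being paired with six column names. A small guard can surface that mismatch earlier and with a clearer message. The sketch below is stdlib-only and illustrative: `check_row_width` is a hypothetical helper, and the first two column names are placeholders because only the last four entries of `SETTINGS_COLUMNS` are visible in this hunk.

```python
# Hypothetical stand-in for SETTINGS_COLUMNS. The first two names are
# placeholders; the last four appear in the diff above.
COLUMNS = [
    "<column 1>",
    "<column 2>",
    "Sex",
    "Tissue Site",
    "Cancer Subtype",
    "Segmentation Config",
]


def check_row_width(columns, row):
    """Raise early if a settings row does not match the column list."""
    if len(row) != len(columns):
        raise ValueError(
            f"{len(columns)} columns expected, row has {len(row)} values"
        )
    return dict(zip(columns, row))
```

A row that still carries a trailing IHC subtype value would fail this check before it reaches the DataFrame constructor.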
src/mosaic/user_results.py ADDED
@@ -0,0 +1,328 @@
1
+ """User analysis results management for HuggingFace Spaces.
2
+
3
+ This module provides ephemeral storage for analysis results, linked to uploaded
4
+ slides. Results are cleared on instance restart (no persistence).
5
+
6
+ Storage structure:
7
+ /tmp/mosaic_user_data/{username}/
8
+ results/
9
+ {analysis_id}/
10
+ metadata.json # Result metadata
11
+ slide_mask.png # Tissue segmentation
12
+ aeon_results.csv # Cancer subtype predictions
13
+ paladin_results.csv # Biomarker predictions
14
+ """
15
+
16
+ import json
17
+ import shutil
18
+ import zipfile
19
+ from dataclasses import asdict, dataclass
20
+ from datetime import datetime
21
+ from pathlib import Path
22
+ from typing import Optional
23
+
24
+ import pandas as pd
25
+ from loguru import logger
26
+ from PIL import Image
27
+
28
+ from mosaic.user_storage import get_user_storage_dir
29
+
30
+
31
+ @dataclass
32
+ class ResultMetadata:
33
+ """Metadata for an analysis result.
34
+
35
+ Attributes:
36
+ analysis_id: Unique identifier for this analysis
37
+ slide_id: ID of the slide that was analyzed
38
+ slide_name: Original slide filename
39
+ timestamp: ISO timestamp of analysis
40
+ settings: Dict of analysis settings (site_type, cancer_subtype, etc.)
41
+ cancer_subtype: Predicted cancer subtype (for quick display)
42
+ """
43
+
44
+ analysis_id: str
45
+ slide_id: str
46
+ slide_name: str
47
+ timestamp: str
48
+ settings: dict
49
+ cancer_subtype: Optional[str] = None
50
+
51
+
52
+ def _get_results_dir(username: str, analysis_id: str) -> Path:
53
+ """Get the directory for a specific analysis result.
54
+
55
+ Args:
56
+ username: Username
57
+ analysis_id: Analysis ID
58
+
59
+ Returns:
60
+ Path to results directory
61
+ """
62
+ user_dir = get_user_storage_dir(username)
63
+ return user_dir / "results" / analysis_id
64
+
65
+
66
+ def save_analysis_results(
67
+ username: str,
68
+ analysis_id: str,
69
+ slide_id: str,
70
+ slide_name: str,
71
+ settings: dict,
72
+ aeon_results: Optional[pd.DataFrame],
73
+ paladin_results: Optional[pd.DataFrame],
74
+ slide_mask: Optional[Image.Image],
75
+ ) -> bool:
76
+ """Save analysis results for a user.
77
+
78
+ Best-effort operation: logs errors but doesn't raise exceptions.
79
+
80
+ Args:
81
+ username: Username
82
+ analysis_id: Unique analysis ID
83
+ slide_id: ID of the analyzed slide
84
+ slide_name: Original slide filename
85
+ settings: Dict of analysis settings
86
+ aeon_results: Aeon predictions DataFrame
87
+ paladin_results: Paladin predictions DataFrame
88
+ slide_mask: Tissue segmentation mask
89
+
90
+ Returns:
91
+ True if saved successfully, False otherwise
92
+ """
93
+ try:
94
+ results_dir = _get_results_dir(username, analysis_id)
95
+ results_dir.mkdir(parents=True, exist_ok=True)
96
+
97
+ # Extract cancer subtype from aeon results (top prediction)
98
+ cancer_subtype = None
99
+ if aeon_results is not None and len(aeon_results) > 0:
100
+ # Aeon results are sorted by confidence, first row is top prediction
101
+ cancer_subtype = (
102
+ str(aeon_results.index[0]) if hasattr(aeon_results, "index") else None
103
+ )
104
+
105
+ # Save metadata
106
+ metadata = ResultMetadata(
107
+ analysis_id=analysis_id,
108
+ slide_id=slide_id,
109
+ slide_name=slide_name,
110
+ timestamp=datetime.utcnow().isoformat(),
111
+ settings=settings,
112
+ cancer_subtype=cancer_subtype,
113
+ )
114
+ metadata_path = results_dir / "metadata.json"
115
+ with open(metadata_path, "w") as f:
116
+ json.dump(asdict(metadata), f, indent=2)
117
+
118
+ # Save aeon results (with index - "Cancer Subtype")
119
+ if aeon_results is not None:
120
+ aeon_path = results_dir / "aeon_results.csv"
121
+ aeon_results.to_csv(aeon_path, index=True)
122
+
123
+ # Save paladin results (no index)
124
+ if paladin_results is not None:
125
+ paladin_path = results_dir / "paladin_results.csv"
126
+ paladin_results.to_csv(paladin_path, index=False)
127
+
128
+ # Save slide mask
129
+ if slide_mask is not None:
130
+ mask_path = results_dir / "slide_mask.png"
131
+ slide_mask.save(mask_path)
132
+
133
+ logger.info(
134
+ f"Saved analysis results for user {username}, "
135
+ f"analysis {analysis_id}, slide {slide_id}"
136
+ )
137
+ return True
138
+
139
+ except Exception as e:
140
+ logger.error(
141
+ f"Failed to save analysis results for user {username}, "
142
+ f"analysis {analysis_id}: {e}"
143
+ )
144
+ return False
145
+
146
+
147
+ def load_analysis_results(
148
+ username: str, analysis_id: str
149
+ ) -> Optional[
150
+ tuple[ResultMetadata, Image.Image | None, pd.DataFrame | None, pd.DataFrame | None]
151
+ ]:
152
+ """Load analysis results for a user.
153
+
154
+ Args:
155
+ username: Username
156
+ analysis_id: Analysis ID
157
+
158
+ Returns:
159
+ Tuple of (metadata, slide_mask, aeon_results, paladin_results) or None if not found
160
+ """
161
+ try:
162
+ results_dir = _get_results_dir(username, analysis_id)
163
+ if not results_dir.exists():
164
+ return None
165
+
166
+ # Load metadata
167
+ metadata_path = results_dir / "metadata.json"
168
+ if not metadata_path.exists():
169
+ return None
170
+
171
+ with open(metadata_path, "r") as f:
172
+ metadata = ResultMetadata(**json.load(f))
173
+
174
+ # Load mask
175
+ mask_path = results_dir / "slide_mask.png"
176
+ slide_mask = Image.open(mask_path) if mask_path.exists() else None
177
+
178
+ # Load aeon results (restore index)
179
+ aeon_path = results_dir / "aeon_results.csv"
180
+ aeon_results = (
181
+ pd.read_csv(aeon_path, index_col=0) if aeon_path.exists() else None
182
+ )
183
+
184
+ # Load paladin results
185
+ paladin_path = results_dir / "paladin_results.csv"
186
+ paladin_results = pd.read_csv(paladin_path) if paladin_path.exists() else None
187
+
188
+ return metadata, slide_mask, aeon_results, paladin_results
189
+
190
+ except Exception as e:
191
+ logger.error(
192
+ f"Failed to load analysis results for user {username}, "
193
+ f"analysis {analysis_id}: {e}"
194
+ )
195
+ return None
196
+
197
+
198
+ def list_user_results(
199
+ username: str, slide_id: Optional[str] = None
200
+ ) -> list[ResultMetadata]:
201
+ """List all analysis results for a user.
202
+
203
+ Returns results sorted by timestamp (newest first).
204
+
205
+ Args:
206
+ username: Username
207
+ slide_id: Optional slide ID to filter by
208
+
209
+ Returns:
210
+ List of ResultMetadata objects
211
+ """
212
+ user_dir = get_user_storage_dir(username)
213
+ results_base = user_dir / "results"
214
+
215
+ if not results_base.exists():
216
+ return []
217
+
218
+ results = []
219
+ for result_dir in results_base.iterdir():
220
+ if not result_dir.is_dir():
221
+ continue
222
+
223
+ metadata_path = result_dir / "metadata.json"
224
+ if not metadata_path.exists():
225
+ continue
226
+
227
+ try:
228
+ with open(metadata_path, "r") as f:
229
+ metadata = ResultMetadata(**json.load(f))
230
+
231
+ # Filter by slide_id if provided
232
+ if slide_id is not None and metadata.slide_id != slide_id:
233
+ continue
234
+
235
+ results.append(metadata)
236
+ except Exception as e:
237
+ logger.warning(f"Failed to load metadata from {result_dir}: {e}")
238
+ continue
239
+
240
+ # Sort by timestamp, newest first
241
+ results.sort(key=lambda r: r.timestamp, reverse=True)
242
+ return results
243
+
244
+
245
+ def delete_analysis_results(username: str, analysis_id: str) -> bool:
246
+ """Delete analysis results.
247
+
248
+ Args:
249
+ username: Username
250
+ analysis_id: Analysis ID
251
+
252
+ Returns:
253
+ True if deleted successfully, False if not found
254
+ """
255
+ try:
256
+ results_dir = _get_results_dir(username, analysis_id)
257
+ if not results_dir.exists():
258
+ return False
259
+
260
+ shutil.rmtree(results_dir)
261
+ logger.info(f"Deleted analysis results {analysis_id} for user {username}")
262
+ return True
263
+
264
+ except Exception as e:
265
+ logger.error(
266
+ f"Failed to delete analysis results {analysis_id} for user {username}: {e}"
267
+ )
268
+ return False
269
+
270
+
271
+ def delete_results_for_slide(username: str, slide_id: str) -> int:
272
+ """Delete all analysis results for a specific slide.
273
+
274
+ Args:
275
+ username: Username
276
+ slide_id: Slide ID
277
+
278
+ Returns:
279
+ Number of results deleted
280
+ """
281
+ results = list_user_results(username, slide_id=slide_id)
282
+ deleted_count = 0
283
+
284
+ for result in results:
285
+ if delete_analysis_results(username, result.analysis_id):
286
+ deleted_count += 1
287
+
288
+ if deleted_count > 0:
289
+ logger.info(
290
+ f"Deleted {deleted_count} analysis results for slide {slide_id}, "
291
+ f"user {username}"
292
+ )
293
+
294
+ return deleted_count
295
+
296
+
297
+ def create_results_zip(username: str, analysis_id: str) -> Optional[Path]:
298
+ """Create a ZIP file containing all result files.
299
+
300
+ Args:
301
+ username: Username
302
+ analysis_id: Analysis ID
303
+
304
+ Returns:
305
+ Path to ZIP file, or None if failed
306
+ """
307
+ try:
308
+ results_dir = _get_results_dir(username, analysis_id)
309
+ if not results_dir.exists():
310
+ return None
311
+
312
+ # Create ZIP in parent directory
313
+ zip_path = results_dir.parent / f"{analysis_id}_results.zip"
314
+
315
+ with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
316
+ # Add all files in results directory
317
+ for file_path in results_dir.iterdir():
318
+ if file_path.is_file():
319
+ zipf.write(file_path, arcname=file_path.name)
320
+
321
+ logger.info(f"Created results ZIP for {analysis_id}, user {username}")
322
+ return zip_path
323
+
324
+ except Exception as e:
325
+ logger.error(
326
+ f"Failed to create results ZIP for {analysis_id}, user {username}: {e}"
327
+ )
328
+ return None
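
The save and load paths in this module both rely on `asdict` plus `json` to persist `ResultMetadata`. The round trip can be sketched in isolation as follows. The dataclass is copied from the diff; the `roundtrip` helper and all field values are illustrative assumptions, and a temporary directory stands in for the per-user results directory.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ResultMetadata:
    analysis_id: str
    slide_id: str
    slide_name: str
    timestamp: str
    settings: dict
    cancer_subtype: Optional[str] = None


def roundtrip(metadata: ResultMetadata, directory: Path) -> ResultMetadata:
    """Write metadata.json and read it back (hypothetical helper)."""
    path = directory / "metadata.json"
    with open(path, "w") as f:
        json.dump(asdict(metadata), f, indent=2)
    with open(path, "r") as f:
        return ResultMetadata(**json.load(f))


with tempfile.TemporaryDirectory() as tmp:
    # Illustrative values only; none of these come from the commit.
    original = ResultMetadata(
        analysis_id="demo-analysis",
        slide_id="demo-slide",
        slide_name="example.svs",
        timestamp="2024-01-01T00:00:00",
        settings={"site_type": "Primary", "sex": "Unknown"},
    )
    restored = roundtrip(original, Path(tmp))
```

`json.dump` accepts only JSON-native types, so anything placed in `settings` or `cancer_subtype` has to be a plain string, number, list, or dict for this pattern to work.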
src/mosaic/user_storage.py ADDED
@@ -0,0 +1,447 @@
1
+ """User storage management for HuggingFace Spaces.
2
+
3
+ This module provides ephemeral storage for uploaded slide files, enforcing
4
+ per-user quotas and providing cleanup utilities. Storage is cleared on
5
+ instance restart (no persistence).
6
+
7
+ Storage structure:
8
+ /tmp/mosaic_user_data/{username}/
9
+ slides/
10
+ {slide_id}.{ext} # Uploaded slide files
11
+ metadata.json # Slide metadata
12
+ storage_info.json # Usage tracking
13
+ """
14
+
15
+ import json
16
+ import os
17
+ import shutil
18
+ import uuid
19
+ from dataclasses import asdict, dataclass
20
+ from datetime import datetime
21
+ from pathlib import Path
22
+ from typing import Optional
23
+
24
+ from loguru import logger
25
+
26
+ # Default per-user storage quota (5GB)
27
+ DEFAULT_QUOTA_GB = int(os.environ.get("MOSAIC_USER_STORAGE_QUOTA_GB", "5"))
28
+ DEFAULT_QUOTA_BYTES = DEFAULT_QUOTA_GB * 1024 * 1024 * 1024
29
+
30
+ # Base directory for user storage (ephemeral)
31
+ USER_STORAGE_BASE = Path("/tmp/mosaic_user_data")
32
+
33
+
34
+ @dataclass
35
+ class SlideMetadata:
36
+ """Metadata for an uploaded slide.
37
+
38
+ Attributes:
39
+ slide_id: Unique identifier for the slide
40
+ original_filename: Original filename when uploaded
41
+ upload_time: ISO timestamp of upload
42
+ size_bytes: File size in bytes
43
+ file_extension: File extension (e.g., '.svs', '.tif')
44
+ """
45
+
46
+ slide_id: str
47
+ original_filename: str
48
+ upload_time: str
49
+ size_bytes: int
50
+ file_extension: str
51
+
52
+
53
+ @dataclass
54
+ class StorageInfo:
55
+ """Storage usage information for a user.
56
+
57
+ Attributes:
58
+ total_bytes: Total bytes used by all files
59
+ file_count: Number of slide files stored
60
+ quota_bytes: Storage quota in bytes
61
+ last_cleanup: ISO timestamp of last cleanup (or None)
62
+ """
63
+
64
+ total_bytes: int
65
+ file_count: int
66
+ quota_bytes: int
67
+ last_cleanup: Optional[str] = None
68
+
69
+
70
+ def get_user_storage_dir(username: str) -> Path:
71
+ """Get the storage directory for a user.
72
+
73
+ Creates the directory structure if it doesn't exist:
74
+ /tmp/mosaic_user_data/{username}/
75
+ slides/
76
+ results/
77
+
78
+ Args:
79
+ username: Username from HF Spaces OAuth
80
+
81
+ Returns:
82
+ Path to user's storage directory
83
+ """
84
+ user_dir = USER_STORAGE_BASE / username
85
+ slides_dir = user_dir / "slides"
86
+ results_dir = user_dir / "results"
87
+
88
+ slides_dir.mkdir(parents=True, exist_ok=True)
89
+ results_dir.mkdir(parents=True, exist_ok=True)
90
+
91
+ return user_dir
92
+
93
+
94
+ def _load_slide_metadata(username: str) -> dict[str, SlideMetadata]:
95
+ """Load slide metadata from disk.
96
+
97
+ Args:
98
+ username: Username
99
+
100
+ Returns:
101
+ Dict mapping slide_id to SlideMetadata
102
+ """
103
+ user_dir = get_user_storage_dir(username)
104
+ metadata_path = user_dir / "slides" / "metadata.json"
105
+
106
+ if not metadata_path.exists():
107
+ return {}
108
+
109
+ try:
110
+ with open(metadata_path, "r") as f:
111
+ data = json.load(f)
112
+ return {k: SlideMetadata(**v) for k, v in data.items()}
113
+ except Exception as e:
114
+ logger.error(f"Failed to load slide metadata for {username}: {e}")
115
+ return {}
116
+
117
+
118
+ def _save_slide_metadata(username: str, metadata: dict[str, SlideMetadata]) -> None:
119
+ """Save slide metadata to disk.
120
+
121
+ Args:
122
+ username: Username
123
+ metadata: Dict mapping slide_id to SlideMetadata
124
+ """
125
+ user_dir = get_user_storage_dir(username)
126
+ metadata_path = user_dir / "slides" / "metadata.json"
127
+
128
+ try:
129
+ data = {k: asdict(v) for k, v in metadata.items()}
130
+ with open(metadata_path, "w") as f:
131
+ json.dump(data, f, indent=2)
132
+ except Exception as e:
133
+ logger.error(f"Failed to save slide metadata for {username}: {e}")
134
+
135
+
136
+ def _load_storage_info(username: str) -> StorageInfo:
137
+ """Load storage info from disk.
138
+
139
+ Args:
140
+ username: Username
141
+
142
+ Returns:
143
+ StorageInfo object
144
+ """
145
+ user_dir = get_user_storage_dir(username)
146
+ info_path = user_dir / "storage_info.json"
147
+
148
+ if not info_path.exists():
149
+ return StorageInfo(total_bytes=0, file_count=0, quota_bytes=DEFAULT_QUOTA_BYTES)
150
+
151
+ try:
152
+ with open(info_path, "r") as f:
153
+ data = json.load(f)
154
+ return StorageInfo(**data)
155
+ except Exception as e:
156
+ logger.error(f"Failed to load storage info for {username}: {e}")
157
+ return StorageInfo(total_bytes=0, file_count=0, quota_bytes=DEFAULT_QUOTA_BYTES)
158
+
159
+
160
+ def _save_storage_info(username: str, info: StorageInfo) -> None:
161
+ """Save storage info to disk.
162
+
163
+ Args:
164
+ username: Username
165
+ info: StorageInfo object
166
+ """
167
+ user_dir = get_user_storage_dir(username)
168
+ info_path = user_dir / "storage_info.json"
169
+
170
+ try:
171
+ with open(info_path, "w") as f:
172
+ json.dump(asdict(info), f, indent=2)
173
+ except Exception as e:
174
+ logger.error(f"Failed to save storage info for {username}: {e}")
175
+
176
+
177
+ def get_storage_usage(username: str) -> StorageInfo:
178
+ """Calculate current storage usage for a user.
179
+
180
+ Recalculates from actual files to ensure accuracy.
181
+
182
+ Args:
183
+ username: Username
184
+
185
+ Returns:
186
+ StorageInfo with current usage
187
+ """
188
+ user_dir = get_user_storage_dir(username)
189
+ slides_dir = user_dir / "slides"
190
+
191
+ total_bytes = 0
192
+ file_count = 0
193
+
194
+ # Calculate from actual files
195
+ for slide_file in slides_dir.glob("*"):
196
+ if slide_file.is_file() and slide_file.name != "metadata.json":
197
+ total_bytes += slide_file.stat().st_size
198
+ file_count += 1
199
+
200
+ info = _load_storage_info(username)
201
+ info.total_bytes = total_bytes
202
+ info.file_count = file_count
203
+
204
+ _save_storage_info(username, info)
205
+ return info
206
+
207
+
208
+ def save_uploaded_slide(
209
+ username: str, slide_path: str | Path, quota_bytes: Optional[int] = None
210
+ ) -> tuple[str, str]:
211
+ """Save an uploaded slide to user storage.
212
+
213
+ Enforces per-user quota. If quota would be exceeded, raises ValueError.
214
+
215
+ Args:
216
+ username: Username
217
+ slide_path: Path to the uploaded slide file
218
+ quota_bytes: Optional override for quota (uses default if None)
219
+
220
+ Returns:
221
+ Tuple of (slide_id, saved_path)
222
+
223
+ Raises:
224
+ ValueError: If quota would be exceeded
225
+ OSError: If file operations fail
226
+ """
227
+ slide_path = Path(slide_path)
228
+ file_size = slide_path.stat().st_size
229
+ file_extension = slide_path.suffix
230
+
231
+ # Check quota
232
+ quota = quota_bytes if quota_bytes is not None else DEFAULT_QUOTA_BYTES
233
+ current_usage = get_storage_usage(username)
234
+
235
+ if current_usage.total_bytes + file_size > quota:
236
+ quota_gb = quota / (1024**3)
237
+ current_gb = current_usage.total_bytes / (1024**3)
238
+ raise ValueError(
239
+ f"Storage quota exceeded. Used: {current_gb:.2f} GB / {quota_gb:.2f} GB. "
240
+ f"Delete old files before uploading."
241
+ )
242
+
243
+ # Generate unique slide ID
244
+ slide_id = str(uuid.uuid4())
245
+ user_dir = get_user_storage_dir(username)
246
+ dest_path = user_dir / "slides" / f"{slide_id}{file_extension}"
247
+
248
+ # Copy file
249
+ shutil.copy2(slide_path, dest_path)
250
+ logger.info(f"Saved slide {slide_id} for user {username} ({file_size} bytes)")
251
+
252
+ # Update metadata
253
+ metadata = _load_slide_metadata(username)
254
+ metadata[slide_id] = SlideMetadata(
255
+ slide_id=slide_id,
256
+ original_filename=slide_path.name,
257
+ upload_time=datetime.utcnow().isoformat(),
258
+ size_bytes=file_size,
259
+ file_extension=file_extension,
260
+ )
261
+ _save_slide_metadata(username, metadata)
262
+
263
+ # Update storage info
264
+ current_usage.total_bytes += file_size
265
+ current_usage.file_count += 1
266
+ _save_storage_info(username, current_usage)
267
+
268
+ return slide_id, str(dest_path)
269
+
270
+
271
+ def list_user_slides(username: str) -> list[SlideMetadata]:
272
+ """List all uploaded slides for a user.
273
+
274
+ Returns slides sorted by upload time (newest first).
275
+
276
+ Args:
277
+ username: Username
278
+
279
+ Returns:
280
+ List of SlideMetadata objects
281
+ """
282
+ metadata = _load_slide_metadata(username)
283
+ slides = list(metadata.values())
284
+ # Sort by upload time, newest first
285
+ slides.sort(key=lambda s: s.upload_time, reverse=True)
286
+ return slides
287
+
288
+
289
+ def get_slide_path(username: str, slide_id: str) -> Optional[Path]:
290
+ """Get the file path for a slide.
291
+
292
+ Args:
293
+ username: Username
294
+ slide_id: Slide ID
295
+
296
+ Returns:
297
+ Path to slide file, or None if not found
298
+ """
299
+ metadata = _load_slide_metadata(username)
300
+ if slide_id not in metadata:
301
+ return None
302
+
303
+ user_dir = get_user_storage_dir(username)
304
+ slide_meta = metadata[slide_id]
305
+ slide_path = user_dir / "slides" / f"{slide_id}{slide_meta.file_extension}"
306
+
307
+ if slide_path.exists():
308
+ return slide_path
309
+ return None
310
+
311
+
312
+ def delete_user_slide(username: str, slide_id: str) -> bool:
313
+ """Delete a slide and all associated results.
314
+
315
+ Args:
316
+ username: Username
317
+ slide_id: Slide ID to delete
318
+
319
+ Returns:
320
+ True if deleted successfully, False if the slide is not found or its file cannot be deleted
321
+ """
322
+ metadata = _load_slide_metadata(username)
323
+ if slide_id not in metadata:
324
+ logger.warning(f"Slide {slide_id} not found for user {username}")
325
+ return False
326
+
327
+ user_dir = get_user_storage_dir(username)
328
+ slide_meta = metadata[slide_id]
329
+ slide_path = user_dir / "slides" / f"{slide_id}{slide_meta.file_extension}"
330
+
331
+ # Delete the file
332
+ try:
333
+ if slide_path.exists():
334
+ slide_path.unlink()
335
+ logger.info(f"Deleted slide {slide_id} for user {username}")
336
+ except Exception as e:
337
+ logger.error(f"Failed to delete slide file {slide_id}: {e}")
338
+ return False
339
+
340
+ # Delete associated results (import here to avoid circular dependency)
341
+ try:
342
+ from mosaic.user_results import delete_results_for_slide
343
+
344
+ delete_results_for_slide(username, slide_id)
345
+ except Exception as e:
346
+ logger.warning(f"Failed to delete results for slide {slide_id}: {e}")
347
+
348
+ # Update metadata
349
+ del metadata[slide_id]
350
+ _save_slide_metadata(username, metadata)
351
+
352
+ # Update storage info
353
+ current_usage = get_storage_usage(username)
354
+ _save_storage_info(username, current_usage)
355
+
356
+ return True
357
+
358
+
359
+ def find_slide_by_name(username: str, name: str) -> Optional[str]:
360
+ """Find a slide UUID by original filename or filename stem.
361
+
362
+ Searches for an exact filename match first, then tries stem match,
363
+ then falls back to treating the input as a UUID.
364
+
365
+ Args:
366
+ username: Username
367
+ name: Original filename, filename stem, or slide UUID
368
+
369
+ Returns:
370
+ slide_id (UUID) if found, None otherwise
371
+ """
372
+ metadata = _load_slide_metadata(username)
373
+
374
+ # Try exact filename match
375
+ for slide_id, meta in metadata.items():
376
+ if meta.original_filename == name:
377
+ return slide_id
378
+
379
+ # Try stem match
380
+ for slide_id, meta in metadata.items():
381
+ if Path(meta.original_filename).stem == name:
382
+ return slide_id
383
+
384
+ # Try as UUID (backward compat)
385
+ if name in metadata:
386
+ return name
387
+
388
+ return None
389
+
390
+
391
+ def cleanup_old_files(username: str, target_bytes: int) -> int:
392
+ """Remove oldest files until storage is under target.
393
+
394
+ Uses FIFO strategy (oldest files deleted first).
395
+
396
+ Args:
397
+ username: Username
398
+ target_bytes: Target storage size in bytes
399
+
400
+ Returns:
401
+ Number of files deleted
402
+ """
403
+ current_usage = get_storage_usage(username)
404
+ if current_usage.total_bytes <= target_bytes:
405
+ return 0
406
+
407
+ slides = list_user_slides(username)
408
+ # Sort by upload time, oldest first
409
+ slides.sort(key=lambda s: s.upload_time)
410
+
411
+ deleted_count = 0
412
+ current_bytes = current_usage.total_bytes
413
+
414
+ for slide in slides:
415
+ if current_bytes <= target_bytes:
416
+ break
417
+
418
+ delete_user_slide(username, slide.slide_id)
419
+ current_bytes -= slide.size_bytes
420
+ deleted_count += 1
421
+
422
+ # Update last cleanup time
423
+ info = _load_storage_info(username)
424
+ info.last_cleanup = datetime.utcnow().isoformat()
425
+ _save_storage_info(username, info)
426
+
427
+ logger.info(
428
+ f"Cleanup for user {username}: deleted {deleted_count} files, "
429
+ f"now using {current_bytes / (1024**3):.2f} GB"
430
+ )
431
+
432
+ return deleted_count
433
+
434
+
435
+ def check_disk_space() -> tuple[int, int, float]:
436
+ """Check global disk space on the instance.
437
+
438
+ Returns:
439
+ Tuple of (used_bytes, total_bytes, percent_used)
440
+ """
441
+ try:
442
+ stat = shutil.disk_usage("/tmp")
443
+ percent_used = (stat.used / stat.total) * 100
444
+ return stat.used, stat.total, percent_used
445
+ except Exception as e:
446
+ logger.error(f"Failed to check disk space: {e}")
447
+ return 0, 0, 0.0
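
The quota check in `save_uploaded_slide` and the oldest-first eviction in `cleanup_old_files` can be sketched in isolation. This is a minimal sketch, not the module's implementation: the helper names `directory_usage`, `store_with_quota`, and `evict_oldest` are hypothetical, it works on a temporary directory instead of `USER_STORAGE_BASE`, and it orders files by modification time, whereas the code above orders by the `upload_time` recorded in `metadata.json`.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def directory_usage(slides_dir: Path) -> int:
    """Sum the sizes of regular files, skipping the metadata index."""
    return sum(
        p.stat().st_size
        for p in slides_dir.iterdir()
        if p.is_file() and p.name != "metadata.json"
    )


def store_with_quota(slides_dir: Path, src: Path, quota_bytes: int) -> Path:
    """Copy src under a fresh UUID name, or raise ValueError if over quota."""
    size = src.stat().st_size
    if directory_usage(slides_dir) + size > quota_bytes:
        raise ValueError("Storage quota exceeded")
    dest = slides_dir / f"{uuid.uuid4()}{src.suffix}"
    shutil.copy2(src, dest)
    return dest


def evict_oldest(slides_dir: Path, target_bytes: int) -> int:
    """Delete files oldest-first (by mtime, an assumption) until under target."""
    files = sorted(
        (p for p in slides_dir.iterdir() if p.is_file() and p.name != "metadata.json"),
        key=lambda p: p.stat().st_mtime,
    )
    usage = sum(p.stat().st_size for p in files)
    deleted = 0
    for p in files:
        if usage <= target_bytes:
            break
        usage -= p.stat().st_size
        p.unlink()
        deleted += 1
    return deleted


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    slides = root / "slides"
    incoming = root / "incoming"  # source files live outside the counted directory
    slides.mkdir()
    incoming.mkdir()

    stored, rejected = 0, 0
    for i in range(4):
        src = incoming / f"slide_{i}.svs"
        src.write_bytes(b"x" * 10_000)
        try:
            store_with_quota(slides, src, quota_bytes=30_000)
            stored += 1
        except ValueError:
            rejected += 1

    deleted = evict_oldest(slides, target_bytes=15_000)
    remaining = directory_usage(slides)
```

In this sketch, as in `save_uploaded_slide`, hitting the quota only rejects the upload; eviction is a separate step that the caller has to invoke.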
tests/benchmark_batch_performance.py CHANGED
@@ -219,7 +219,7 @@ def main():
219
  "Sex": ["Unknown"] * len(slides),
220
  "Tissue Site": ["Unknown"] * len(slides),
221
  "Cancer Subtype": ["Unknown"] * len(slides),
222
- "IHC Subtype": [""] * len(slides),
223
  "Segmentation Config": ["Biopsy"] * len(slides),
224
  }
225
  )
 
219
  "Sex": ["Unknown"] * len(slides),
220
  "Tissue Site": ["Unknown"] * len(slides),
221
  "Cancer Subtype": ["Unknown"] * len(slides),
222
+ # "IHC Subtype": [""] * len(slides),
223
  "Segmentation Config": ["Biopsy"] * len(slides),
224
  }
225
  )
tests/test_cli.py CHANGED
@@ -74,7 +74,7 @@ class TestArgumentParsing:
74
  "Sex": ["Unknown"],
75
  "Tissue Site": ["Unknown"],
76
  "Cancer Subtype": ["Unknown"],
77
- "IHC Subtype": [""],
78
  "Segmentation Config": ["Biopsy"],
79
  }
80
  )
 
74
  "Sex": ["Unknown"],
75
  "Tissue Site": ["Unknown"],
76
  "Cancer Subtype": ["Unknown"],
77
+ # "IHC Subtype": [""], # Not yet in use
78
  "Segmentation Config": ["Biopsy"],
79
  }
80
  )
tests/test_fixtures.py CHANGED
@@ -90,7 +90,7 @@ def sample_settings_df():
90
  "Sex": ["Male", "Female", "Male"],
91
  "Tissue Site": ["Lung", "Liver", "Unknown"],
92
  "Cancer Subtype": ["Unknown", "Lung Adenocarcinoma (LUAD)", "Unknown"],
93
- "IHC Subtype": ["", "", ""],
94
  "Segmentation Config": ["Biopsy", "Resection", "TCGA"],
95
  }
96
  )
@@ -112,7 +112,7 @@ def create_settings_df(n_rows, **kwargs):
112
  "Sex": ["Unknown"] * n_rows,
113
  "Tissue Site": ["Unknown"] * n_rows,
114
  "Cancer Subtype": ["Unknown"] * n_rows,
115
- "IHC Subtype": [""] * n_rows,
116
  "Segmentation Config": ["Biopsy"] * n_rows,
117
  }
118
 
@@ -137,14 +137,12 @@ def create_settings_df(n_rows, **kwargs):
137
  def sample_csv_valid():
138
  """Temporary CSV file with valid settings."""
139
  with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
 
 
140
  f.write(
141
- "Slide,Site Type,Sex,Tissue Site,Cancer Subtype,IHC Subtype,Segmentation Config\n"
142
  )
143
- f.write("slide1.svs,Primary,Unknown,Lung,Unknown,,Biopsy\n")
144
- f.write(
145
- "slide2.svs,Metastatic,Female,Liver,Lung Adenocarcinoma (LUAD),,Resection\n"
146
- )
147
- f.write("slide3.svs,Primary,Male,Unknown,Unknown,,TCGA\n")
148
  f.flush()
149
  yield f.name
150
  Path(f.name).unlink(missing_ok=True)
@@ -154,11 +152,9 @@ def sample_csv_valid():
154
  def sample_csv_invalid():
155
  """Temporary CSV file with invalid values (for validation testing)."""
156
  with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
 
157
  f.write(
158
- "Slide,Site Type,Sex,Tissue Site,Cancer Subtype,IHC Subtype,Segmentation Config\n"
159
- )
160
- f.write(
161
- "slide1.svs,InvalidSite,InvalidSex,InvalidTissue,InvalidSubtype,InvalidIHC,InvalidConfig\n"
162
  )
163
  f.write(
164
  "slide2.svs,Primary,Unknown,Lung,BRCA,HR+/HER2+,Biopsy\n"
 
90
  "Sex": ["Male", "Female", "Male"],
91
  "Tissue Site": ["Lung", "Liver", "Unknown"],
92
  "Cancer Subtype": ["Unknown", "Lung Adenocarcinoma (LUAD)", "Unknown"],
93
+ # "IHC Subtype": ["", "", ""], # Not yet in use
94
  "Segmentation Config": ["Biopsy", "Resection", "TCGA"],
95
  }
96
  )
 
112
  "Sex": ["Unknown"] * n_rows,
113
  "Tissue Site": ["Unknown"] * n_rows,
114
  "Cancer Subtype": ["Unknown"] * n_rows,
115
+ # "IHC Subtype": [""] * n_rows, # Not yet in use
116
  "Segmentation Config": ["Biopsy"] * n_rows,
117
  }
118
 
 
137
  def sample_csv_valid():
138
  """Temporary CSV file with valid settings."""
139
  with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
140
+ f.write("Slide,Site Type,Sex,Tissue Site,Cancer Subtype,Segmentation Config\n")
141
+ f.write("slide1.svs,Primary,Unknown,Lung,Unknown,Biopsy\n")
142
  f.write(
143
+ "slide2.svs,Metastatic,Female,Liver,Lung Adenocarcinoma (LUAD),Resection\n"
144
  )
145
+ f.write("slide3.svs,Primary,Male,Unknown,Unknown,TCGA\n")
 
 
 
 
146
  f.flush()
147
  yield f.name
148
  Path(f.name).unlink(missing_ok=True)
 
152
  def sample_csv_invalid():
153
  """Temporary CSV file with invalid values (for validation testing)."""
154
  with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
155
+ f.write("Slide,Site Type,Sex,Tissue Site,Cancer Subtype,Segmentation Config\n")
156
  f.write(
157
+ "slide1.svs,InvalidSite,InvalidSex,InvalidTissue,InvalidSubtype,InvalidConfig\n"
 
 
 
158
  )
159
  f.write(
160
  "slide2.svs,Primary,Unknown,Lung,BRCA,HR+/HER2+,Biopsy\n"
tests/test_gradio_app.py CHANGED
@@ -48,7 +48,7 @@ class TestConstants:
48
  "Slide",
49
  "Site Type",
50
  "Cancer Subtype",
51
- "IHC Subtype",
52
  "Segmentation Config",
53
  ]
54
  for field in required_fields:
@@ -81,11 +81,9 @@ class TestLoadSettings:
81
  def temp_settings_csv(self):
82
  """Create a temporary settings CSV file with all columns."""
83
  with tempfile.NamedTemporaryFile(mode="w", delete=False, suffix=".csv") as f:
84
- f.write(
85
- "Slide,Site Type,Sex,Cancer Subtype,IHC Subtype,Segmentation Config\n"
86
- )
87
- f.write("slide1.svs,Primary,Male,Unknown,,Biopsy\n")
88
- f.write("slide2.svs,Metastatic,Female,Unknown,,Resection\n")
89
  temp_path = f.name
90
  yield temp_path
91
  Path(temp_path).unlink()
@@ -117,10 +115,10 @@ class TestLoadSettings:
117
  df = load_settings(temp_minimal_settings_csv)
118
  assert "Segmentation Config" in df.columns
119
  assert "Cancer Subtype" in df.columns
120
- assert "IHC Subtype" in df.columns
121
  assert df["Segmentation Config"].iloc[0] == "Biopsy"
122
  assert df["Cancer Subtype"].iloc[0] == "Unknown"
123
- assert df["IHC Subtype"].iloc[0] == ""
124
 
125
  def test_load_settings_preserves_data(self, temp_settings_csv):
126
  """Test that data is preserved correctly."""
 
48
  "Slide",
49
  "Site Type",
50
  "Cancer Subtype",
51
+ # "IHC Subtype", # Not yet in use
52
  "Segmentation Config",
53
  ]
54
  for field in required_fields:
 
81
  def temp_settings_csv(self):
82
  """Create a temporary settings CSV file with all columns."""
83
  with tempfile.NamedTemporaryFile(mode="w", delete=False, suffix=".csv") as f:
84
+ f.write("Slide,Site Type,Sex,Cancer Subtype,Segmentation Config\n")
85
+ f.write("slide1.svs,Primary,Male,Unknown,Biopsy\n")
86
+ f.write("slide2.svs,Metastatic,Female,Unknown,Resection\n")
 
 
87
  temp_path = f.name
88
  yield temp_path
89
  Path(temp_path).unlink()
 
115
  df = load_settings(temp_minimal_settings_csv)
116
  assert "Segmentation Config" in df.columns
117
  assert "Cancer Subtype" in df.columns
118
+ # assert "IHC Subtype" in df.columns # Not yet in use
119
  assert df["Segmentation Config"].iloc[0] == "Biopsy"
120
  assert df["Cancer Subtype"].iloc[0] == "Unknown"
121
+ # assert df["IHC Subtype"].iloc[0] == "" # Not yet in use
122
 
123
  def test_load_settings_preserves_data(self, temp_settings_csv):
124
  """Test that data is preserved correctly."""
tests/test_regression_single_slide.py CHANGED
@@ -143,7 +143,7 @@ class TestSingleSlideRegression:
143
  "Sex": ["Male"],
144
  "Tissue Site": ["Lung"],
145
  "Cancer Subtype": ["Unknown"],
146
- "IHC Subtype": [""],
147
  "Segmentation Config": ["Biopsy"],
148
  }
149
  )
 
143
  "Sex": ["Male"],
144
  "Tissue Site": ["Lung"],
145
  "Cancer Subtype": ["Unknown"],
146
+ # "IHC Subtype": [""],
147
  "Segmentation Config": ["Biopsy"],
148
  }
149
  )
tests/test_settings_upload.py CHANGED
@@ -234,11 +234,9 @@ class TestCsvFormatEdgeCases:
234
  from mosaic.ui.utils import load_settings
235
 
236
  with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
237
- f.write(
238
- "Slide,Site Type,Sex,Cancer Subtype,IHC Subtype,Segmentation Config\n"
239
- )
240
  # Empty values for optional columns
241
- f.write("slide1.svs,Primary,Male,Unknown,,\n")
242
  f.flush()
243
  temp_path = f.name
244
 
@@ -250,7 +248,7 @@ class TestCsvFormatEdgeCases:
250
  assert len(df) == 1
251
  # Empty strings should be preserved (validation will handle defaults later)
252
  assert df["Segmentation Config"].iloc[0] == ""
253
- assert df["IHC Subtype"].iloc[0] == ""
254
  finally:
255
  Path(temp_path).unlink(missing_ok=True)
256
 
 
234
  from mosaic.ui.utils import load_settings
235
 
236
  with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
237
+ f.write("Slide,Site Type,Sex,Cancer Subtype,Segmentation Config\n")
 
 
238
  # Empty values for optional columns
239
+ f.write("slide1.svs,Primary,Male,Unknown,\n")
240
  f.flush()
241
  temp_path = f.name
242
 
 
248
  assert len(df) == 1
249
  # Empty strings should be preserved (validation will handle defaults later)
250
  assert df["Segmentation Config"].iloc[0] == ""
251
+ # assert df["IHC Subtype"].iloc[0] == "" # Not yet in use
252
  finally:
253
  Path(temp_path).unlink(missing_ok=True)
254
 
tests/test_ui_components.py CHANGED
@@ -40,7 +40,7 @@ class TestSettingsValidation:
40
  "Sex": ["Male"],
41
  "Tissue Site": ["Unknown"],
42
  "Cancer Subtype": ["InvalidSubtype"],
43
- "IHC Subtype": [""],
44
  "Segmentation Config": ["Biopsy"],
45
  }
46
  )
@@ -72,7 +72,7 @@ class TestSettingsValidation:
72
  "Sex": ["Male"],
73
  "Tissue Site": ["Unknown"],
74
  "Cancer Subtype": ["Unknown"],
75
- "IHC Subtype": [""],
76
  "Segmentation Config": ["Biopsy"],
77
  }
78
  )
@@ -117,7 +117,7 @@ class TestAnalysisWorkflow:
117
  "Sex": ["Male"],
118
  "Tissue Site": ["Unknown"],
119
  "Cancer Subtype": ["Unknown"],
120
- "IHC Subtype": [""],
121
  "Segmentation Config": ["Biopsy"],
122
  }
123
  )
@@ -197,7 +197,7 @@ class TestAnalysisWorkflow:
197
  "Sex": ["Male", "Female", "Male"],
198
  "Tissue Site": ["Unknown", "Unknown", "Unknown"],
199
  "Cancer Subtype": ["Unknown", "Unknown", "Unknown"],
200
- "IHC Subtype": ["", "", ""],
201
  "Segmentation Config": ["Biopsy", "Biopsy", "Biopsy"],
202
  }
203
  )
 
40
  "Sex": ["Male"],
41
  "Tissue Site": ["Unknown"],
42
  "Cancer Subtype": ["InvalidSubtype"],
43
+ # "IHC Subtype": [""],
44
  "Segmentation Config": ["Biopsy"],
45
  }
46
  )
 
72
  "Sex": ["Male"],
73
  "Tissue Site": ["Unknown"],
74
  "Cancer Subtype": ["Unknown"],
75
+ # "IHC Subtype": [""],
76
  "Segmentation Config": ["Biopsy"],
77
  }
78
  )
 
117
  "Sex": ["Male"],
118
  "Tissue Site": ["Unknown"],
119
  "Cancer Subtype": ["Unknown"],
120
+ # "IHC Subtype": [""],
121
  "Segmentation Config": ["Biopsy"],
122
  }
123
  )
 
197
  "Sex": ["Male", "Female", "Male"],
198
  "Tissue Site": ["Unknown", "Unknown", "Unknown"],
199
  "Cancer Subtype": ["Unknown", "Unknown", "Unknown"],
200
+ # "IHC Subtype": ["", "", ""],
201
  "Segmentation Config": ["Biopsy", "Biopsy", "Biopsy"],
202
  }
203
  )
tests/test_ui_events.py CHANGED
@@ -72,7 +72,7 @@ class TestGeneratorBehavior:
72
  "Sex": ["Male"],
73
  "Tissue Site": ["Unknown"],
74
  "Cancer Subtype": ["Unknown"],
75
- "IHC Subtype": [""],
76
  "Segmentation Config": ["Biopsy"],
77
  }
78
  )
@@ -141,7 +141,7 @@ class TestGeneratorBehavior:
141
  "Sex": ["Male", "Female", "Male"],
142
  "Tissue Site": ["Unknown", "Unknown", "Unknown"],
143
  "Cancer Subtype": ["Unknown", "Unknown", "Unknown"],
144
- "IHC Subtype": ["", "", ""],
145
  "Segmentation Config": ["Biopsy", "Biopsy", "Biopsy"],
146
  }
147
  )
@@ -237,7 +237,7 @@ class TestGeneratorBehavior:
237
  "Sex": ["Male", "Female", "Male"],
238
  "Tissue Site": ["Unknown", "Unknown", "Unknown"],
239
  "Cancer Subtype": ["Unknown", "Unknown", "Unknown"],
240
- "IHC Subtype": ["", "", ""],
241
  "Segmentation Config": ["Biopsy", "Biopsy", "Biopsy"],
242
  }
243
  )
@@ -337,7 +337,7 @@ class TestGeneratorBehavior:
337
  "Sex": ["Male", "Female", "Male"],
338
  "Tissue Site": ["Lung", "Lung", "Lung"],
339
  "Cancer Subtype": ["LUAD", "LUAD", "LUAD"],
340
- "IHC Subtype": ["", "", ""],
341
  "Segmentation Config": ["Biopsy", "Biopsy", "Biopsy"],
342
  }
343
  )
@@ -418,7 +418,7 @@ class TestErrorDisplay:
418
  "Sex": ["Male", "InvalidSex"],
419
  "Tissue Site": ["Unknown", "Unknown"],
420
  "Cancer Subtype": ["InvalidSubtype", "Unknown"],
421
- "IHC Subtype": ["", ""],
422
  "Segmentation Config": ["Biopsy", "InvalidConfig"],
423
  }
424
  )
@@ -614,7 +614,7 @@ class TestTCGASegmentationConfig:
614
  "Sex": ["Male", "Female", "Male"],
615
  "Tissue Site": ["Lung", "Breast", "Liver"],
616
  "Cancer Subtype": ["LUAD", "BRCA", "LIHC"],
617
- "IHC Subtype": ["", "", ""],
618
  "Segmentation Config": ["Biopsy", "Resection", "Biopsy"],
619
  }
620
  )
 
72
  "Sex": ["Male"],
73
  "Tissue Site": ["Unknown"],
74
  "Cancer Subtype": ["Unknown"],
75
+ # "IHC Subtype": [""],
76
  "Segmentation Config": ["Biopsy"],
77
  }
78
  )
 
141
  "Sex": ["Male", "Female", "Male"],
142
  "Tissue Site": ["Unknown", "Unknown", "Unknown"],
143
  "Cancer Subtype": ["Unknown", "Unknown", "Unknown"],
144
+ # "IHC Subtype": ["", "", ""],
145
  "Segmentation Config": ["Biopsy", "Biopsy", "Biopsy"],
146
  }
147
  )
 
237
  "Sex": ["Male", "Female", "Male"],
238
  "Tissue Site": ["Unknown", "Unknown", "Unknown"],
239
  "Cancer Subtype": ["Unknown", "Unknown", "Unknown"],
240
+ # "IHC Subtype": ["", "", ""],
241
  "Segmentation Config": ["Biopsy", "Biopsy", "Biopsy"],
242
  }
243
  )
 
337
  "Sex": ["Male", "Female", "Male"],
338
  "Tissue Site": ["Lung", "Lung", "Lung"],
339
  "Cancer Subtype": ["LUAD", "LUAD", "LUAD"],
340
+ # "IHC Subtype": ["", "", ""],
341
  "Segmentation Config": ["Biopsy", "Biopsy", "Biopsy"],
342
  }
343
  )
 
418
  "Sex": ["Male", "InvalidSex"],
419
  "Tissue Site": ["Unknown", "Unknown"],
420
  "Cancer Subtype": ["InvalidSubtype", "Unknown"],
421
+ # "IHC Subtype": ["", ""],
422
  "Segmentation Config": ["Biopsy", "InvalidConfig"],
423
  }
424
  )
 
614
  "Sex": ["Male", "Female", "Male"],
615
  "Tissue Site": ["Lung", "Breast", "Liver"],
616
  "Cancer Subtype": ["LUAD", "BRCA", "LIHC"],
617
+ # "IHC Subtype": ["", "", ""],
618
  "Segmentation Config": ["Biopsy", "Resection", "Biopsy"],
619
  }
620
  )
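
The test changes above all remove `IHC Subtype` so that each settings row matches the six active columns. A minimal sketch of a guard that would surface a stray field early is shown below. It is plain Python with no pandas; `make_settings_row` is a hypothetical helper, and the column order is an assumption taken from the CSV header in `tests/test_fixtures.py`.

```python
# Assumed column list, mirroring the six-column CSV header used in the fixtures.
SETTINGS_COLUMNS = [
    "Slide",
    "Site Type",
    "Sex",
    "Tissue Site",
    "Cancer Subtype",
    # "IHC Subtype",  # not yet in use
    "Segmentation Config",
]


def make_settings_row(values: dict[str, str]) -> list[str]:
    """Build one row in SETTINGS_COLUMNS order, rejecting unknown fields."""
    extra = set(values) - set(SETTINGS_COLUMNS)
    if extra:
        raise ValueError(f"Unexpected settings fields: {sorted(extra)}")
    # Missing fields default to an empty string.
    return [values.get(col, "") for col in SETTINGS_COLUMNS]


row = make_settings_row(
    {
        "Slide": "slide1.svs",
        "Site Type": "Primary",
        "Sex": "Male",
        "Tissue Site": "Lung",
        "Cancer Subtype": "Unknown",
        "Segmentation Config": "Biopsy",
    }
)

try:
    make_settings_row({"Slide": "slide1.svs", "IHC Subtype": ""})
    stray_field_rejected = False
except ValueError:
    stray_field_rejected = True
```

Building rows from the column list, instead of from a positional argument list, keeps the row width tied to `SETTINGS_COLUMNS` when a column is commented out.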
tests/test_ui_user_storage.py ADDED
@@ -0,0 +1,428 @@
1
+ """Integration tests for user storage management in Gradio UI.
2
+
3
+ Tests end-to-end workflows including upload, analysis, viewing, and deletion
4
+ of slides and results through the UI components.
5
+ """
6
+
7
+ import os
8
+ import tempfile
9
+ from pathlib import Path
10
+ from unittest.mock import MagicMock, patch
11
+
12
+ import pandas as pd
13
+ import pytest
14
+
15
+ from mosaic.ui.app import launch_gradio
16
+ from mosaic.ui.user_tabs import (
17
+ LOCAL_DEBUG_USERNAME,
18
+ _get_username,
19
+ create_my_files_tab,
20
+ create_my_results_tab,
21
+ )
22
+ from mosaic.user_results import save_analysis_results
23
+ from mosaic.user_storage import save_uploaded_slide
24
+
25
+
26
+ @pytest.fixture
27
+ def temp_storage_dir(tmp_path):
28
+ """Create temporary storage directory for testing."""
29
+ storage_dir = tmp_path / "mosaic_user_data"
30
+ storage_dir.mkdir()
31
+
32
+ # Mock the storage base directory
33
+ # Note: user_results imports from user_storage, so only mock user_storage
34
+ with patch("mosaic.user_storage.USER_STORAGE_BASE", storage_dir):
35
+ yield storage_dir
36
+
37
+
38
+ @pytest.fixture
39
+ def mock_request_local():
40
+ """Mock Gradio request for local mode (non-HF Spaces)."""
41
+ request = MagicMock()
42
+ request.username = None
43
+ request.session_hash = "test_session"
44
+ return request
45
+
46
+
47
+ @pytest.fixture
48
+ def mock_request_hf_logged_in():
49
+ """Mock Gradio request for HF Spaces logged-in user."""
50
+ request = MagicMock()
51
+ request.username = "test_user"
52
+ request.session_hash = "test_session"
53
+ return request
54
+
55
+
56
+ @pytest.fixture
57
+ def mock_request_hf_not_logged_in():
58
+ """Mock Gradio request for HF Spaces not logged in."""
59
+ request = MagicMock()
60
+ request.username = None
61
+ request.session_hash = "test_session"
62
+ return request
63
+
64
+
65
+ @pytest.fixture
66
+ def sample_slide_file(tmp_path):
67
+ """Create a sample slide file for testing."""
68
+ slide_file = tmp_path / "sample_slide.svs"
69
+ slide_file.write_text("fake slide data" * 1000) # ~15KB
70
+
71
+ # Return the Path itself: save_uploaded_slide accepts a str or Path,
72
+ # so no file-like wrapper is needed.
73
+
74
+ return slide_file
75
+
76
+
77
+ class TestUsernameExtraction:
78
+ """Test username extraction in different modes."""
79
+
80
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", False)
81
+ def test_local_mode_returns_debug_username(self, mock_request_local):
82
+ """Local mode should return LOCAL_DEBUG_USERNAME."""
83
+ username, is_local = _get_username(mock_request_local)
84
+ assert username == LOCAL_DEBUG_USERNAME
85
+ assert is_local is True
86
+
87
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", True)
88
+ def test_hf_logged_in_returns_username(self, mock_request_hf_logged_in):
89
+ """HF Spaces logged in should return request username."""
90
+ username, is_local = _get_username(mock_request_hf_logged_in)
91
+ assert username == "test_user"
92
+ assert is_local is False
93
+
94
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", True)
95
+ def test_hf_not_logged_in_returns_none(self, mock_request_hf_not_logged_in):
96
+ """HF Spaces not logged in should return None."""
97
+ username, is_local = _get_username(mock_request_hf_not_logged_in)
98
+ assert username is None
99
+ assert is_local is False
100
+
101
+
102
+ class TestEndToEndUserStorage:
103
+ """Test complete user storage workflow."""
104
+
105
+ def test_upload_analyze_view_delete_workflow(
106
+ self, temp_storage_dir, mock_request_local
107
+ ):
108
+ """Test complete workflow: upload → analyze → view → delete."""
109
+ username = LOCAL_DEBUG_USERNAME
110
+
111
+ # Create a slide file
112
+ temp_file = temp_storage_dir / "sample_slide.svs"
113
+ temp_file.write_text("fake slide data" * 1000)
114
+
115
+ # Step 1: Upload slide (use Path directly, not MockFile)
116
+ slide_id, _ = save_uploaded_slide(username, temp_file)
117
+ assert slide_id is not None
118
+
119
+ # Verify slide saved
120
+ from mosaic.user_storage import list_user_slides
121
+
122
+ slides = list_user_slides(username)
123
+ assert len(slides) == 1
124
+ assert slides[0].slide_id == slide_id
125
+
133
+ # Step 2: Save analysis results
134
+ settings = {
135
+ "seg_config": "Biopsy",
136
+ "site_type": "Primary",
137
+ "sex": "Male",
138
+ "tissue_site": "Lung",
139
+ "cancer_subtype": "LUAD",
140
+ }
141
+
142
+ aeon_df = pd.DataFrame(
143
+ {
144
+ "cancer_type": ["LUAD", "LUSC"],
145
+ "confidence": [0.85, 0.10],
146
+ }
147
+ )
148
+
149
+ analysis_id = "test_analysis_001"
150
+ save_analysis_results(
151
+ username=username,
152
+ analysis_id=analysis_id,
153
+ slide_id=slide_id,
154
+ slide_name="sample_slide.svs",
155
+ settings=settings,
156
+ aeon_results=aeon_df,
157
+ paladin_results=None,
158
+ slide_mask=None,
159
+ )
160
+
161
+ # Step 3: Verify results saved
162
+ from mosaic.user_results import list_user_results, load_analysis_results
163
+
164
+ results = list_user_results(username)
165
+ assert len(results) == 1
166
+ assert results[0].analysis_id == analysis_id
167
+ assert results[0].slide_id == slide_id
168
+
169
+ # Step 4: Load result details
170
+ loaded = load_analysis_results(username, analysis_id)
171
+ assert loaded is not None
172
+ metadata, mask, aeon, paladin = loaded
173
+ assert metadata.slide_name == "sample_slide.svs"
174
+ assert aeon is not None
175
+ assert len(aeon) == 2
176
+
177
+ # Step 5: Delete slide (should also delete results)
178
+ from mosaic.user_storage import delete_user_slide
179
+
180
+ success = delete_user_slide(username, slide_id)
181
+ assert success is True
182
+
183
+ # Verify slide deleted
184
+ slides = list_user_slides(username)
185
+ assert len(slides) == 0
186
+
187
+ # Verify results also deleted
188
+ results = list_user_results(username, slide_id=slide_id)
189
+ assert len(results) == 0
190
+
191
+ def test_quota_enforcement_and_cleanup(self, temp_storage_dir, mock_request_local):
192
+ """Test that uploads are rejected once the quota would be exceeded."""
193
+ username = LOCAL_DEBUG_USERNAME
194
+
195
+ # Set very small quota (30KB)
196
+ with patch("mosaic.user_storage.DEFAULT_QUOTA_BYTES", 30 * 1024):
197
+ # Upload multiple files (each ~10KB)
198
+ slide_ids = []
199
+ for i in range(5):
200
+ temp_file = temp_storage_dir / f"slide_{i}.svs"
201
+ temp_file.write_text("fake data " * 1000) # ~10KB each
202
+
203
+ try:
204
+ slide_id, _ = save_uploaded_slide(username, temp_file)
205
+ slide_ids.append(slide_id)
206
+ except ValueError:
207
+ # Quota exceeded: save_uploaded_slide rejects the upload and does not clean up
208
+ pass
209
+
210
+ # Only the uploads that fit within the quota should be stored
211
+ from mosaic.user_storage import list_user_slides, get_storage_usage
212
+
213
+ slides = list_user_slides(username)
214
+ usage = get_storage_usage(username)
215
+
216
+ # Total should not exceed the quota
217
+ assert usage.total_bytes <= 30 * 1024
218
+ # Later uploads should have been rejected (only 3 should fit)
219
+ assert len(slides) <= 3
220
+
221
+ def test_multiple_analyses_same_slide(self, temp_storage_dir, mock_request_local):
222
+ """Test multiple analyses can reference the same slide."""
223
+ username = LOCAL_DEBUG_USERNAME
224
+
225
+ # Upload slide once
226
+ temp_file = temp_storage_dir / "multi_analysis.svs"
227
+ temp_file.write_text("fake slide data" * 1000)
228
+
229
+ slide_id, _ = save_uploaded_slide(username, temp_file)
230
+
231
+ # Save multiple analysis results for same slide
232
+ settings1 = {"seg_config": "Biopsy", "site_type": "Primary", "sex": "Male"}
233
+ settings2 = {"seg_config": "Resection", "site_type": "Primary", "sex": "Male"}
234
+
235
+ save_analysis_results(
236
+ username=username,
237
+ analysis_id="analysis_001",
238
+ slide_id=slide_id,
239
+ slide_name="sample.svs",
240
+ settings=settings1,
241
+ aeon_results=None,
242
+ paladin_results=None,
243
+ slide_mask=None,
244
+ )
245
+
246
+ save_analysis_results(
247
+ username=username,
248
+ analysis_id="analysis_002",
249
+ slide_id=slide_id,
250
+ slide_name="sample.svs",
251
+ settings=settings2,
252
+ aeon_results=None,
253
+ paladin_results=None,
254
+ slide_mask=None,
255
+ )
256
+
257
+ # Verify both results saved
258
+ from mosaic.user_results import list_user_results
259
+
260
+ results = list_user_results(username, slide_id=slide_id)
261
+ assert len(results) == 2
262
+
263
+ # Delete slide should delete both results
264
+ from mosaic.user_storage import delete_user_slide
265
+
266
+ delete_user_slide(username, slide_id)
267
+
268
+ results = list_user_results(username, slide_id=slide_id)
269
+ assert len(results) == 0
270
+
271
+
272
+ class TestMyFilesTabIntegration:
273
+ """Test My Files tab UI integration."""
274
+
275
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", False)
276
+ def test_load_files_local_mode(self, temp_storage_dir, mock_request_local):
277
+ """Test loading files in local debug mode."""
278
+ username = LOCAL_DEBUG_USERNAME
279
+
280
+ # Upload a slide first
281
+ temp_file = temp_storage_dir / "sample_slide.svs"
282
+ temp_file.write_text("fake slide data" * 1000)
283
+
284
+ save_uploaded_slide(username, temp_file)
285
+
286
+ # Test the loading logic
287
+ from mosaic.ui.user_tabs import _get_username
288
+ from mosaic.user_storage import list_user_slides, get_storage_usage
289
+
290
+ username_returned, is_local = _get_username(mock_request_local)
291
+ assert username_returned == LOCAL_DEBUG_USERNAME
292
+ assert is_local is True
293
+
294
+ # Verify slides can be listed
295
+ slides = list_user_slides(username)
296
+ assert len(slides) == 1
297
+ assert slides[0].original_filename == "sample_slide.svs"
298
+
299
+ # Verify storage usage
300
+ usage = get_storage_usage(username)
301
+ assert usage.file_count == 1
302
+
303
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", True)
304
+ def test_load_files_hf_not_logged_in(self, mock_request_hf_not_logged_in):
305
+ """Test loading files when not logged in on HF Spaces."""
306
+ # Import the load_files logic directly (would be in create_my_files_tab)
307
+ from mosaic.ui.user_tabs import _get_username
308
+ from mosaic.user_storage import list_user_slides, get_storage_usage
309
+
310
+ username, _ = _get_username(mock_request_hf_not_logged_in)
311
+
312
+ # Should return None for not logged in
313
+ assert username is None
314
+
315
+ # Verify behavior would show "not logged in" message
316
+ # (actual function would return empty data)
317
+
318
+
319
+ class TestMyResultsTabIntegration:
320
+ """Test My Results tab UI integration."""
321
+
322
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", False)
323
+ def test_load_results_local_mode(self, temp_storage_dir, mock_request_local):
324
+ """Test loading results in local debug mode."""
325
+ username = LOCAL_DEBUG_USERNAME
326
+
327
+ # Create analysis result
328
+ temp_file = temp_storage_dir / "sample_slide.svs"
329
+ temp_file.write_text("fake slide data" * 1000)
330
+
331
+ slide_id, _ = save_uploaded_slide(username, temp_file)
332
+ save_analysis_results(
333
+ username=username,
334
+ analysis_id="test_001",
335
+ slide_id=slide_id,
336
+ slide_name="sample.svs",
337
+ settings={"sex": "Male", "site_type": "Primary"},
338
+ aeon_results=None,
339
+ paladin_results=None,
340
+ slide_mask=None,
341
+ )
342
+
343
+ # Test the loading logic
344
+ from mosaic.ui.user_tabs import _get_username
345
+ from mosaic.user_results import list_user_results
346
+
347
+ username_returned, _ = _get_username(mock_request_local)
348
+ assert username_returned == LOCAL_DEBUG_USERNAME
349
+
350
+ # Verify results can be listed
351
+ results = list_user_results(username)
352
+ assert len(results) == 1
353
+ assert results[0].analysis_id == "test_001"
354
+ assert results[0].slide_name == "sample.svs"
355
+
356
+
357
+ class TestStorageWarningsUI:
358
+ """Test storage warning display in main analysis tab."""
359
+
360
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", False)
361
+ def test_normal_storage_usage(self, temp_storage_dir, mock_request_local):
362
+ """Test normal storage usage display (< 80%)."""
363
+ # Import the function (would normally be in app.py)
364
+ from mosaic.ui.user_tabs import _get_username
365
+ from mosaic.user_storage import get_storage_usage
366
+
367
+ username, _ = _get_username(mock_request_local)
368
+ usage = get_storage_usage(username)
369
+
370
+ # Should be 0% initially
371
+ percent = (
372
+ (usage.total_bytes / usage.quota_bytes * 100)
373
+ if usage.quota_bytes > 0
374
+ else 0
375
+ )
376
+ assert percent == 0
377
+
378
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", False)
379
+ def test_warning_threshold_storage(self, temp_storage_dir, mock_request_local):
380
+ """Test storage warning at 80% threshold."""
381
+ username = LOCAL_DEBUG_USERNAME
382
+
383
+ # Set quota to force warning threshold
384
+ with patch("mosaic.user_storage.DEFAULT_QUOTA_BYTES", 18 * 1024): # 18KB quota
385
+ # Upload file to reach warning threshold
386
+ temp_file = temp_storage_dir / "large_slide.svs"
387
+ temp_file.write_text("large fake data" * 1000) # ~15KB (15,000 bytes, about 81% of the 18KB quota)
388
+
389
+ save_uploaded_slide(username, temp_file)
390
+
391
+ from mosaic.user_storage import get_storage_usage
392
+
393
+ usage = get_storage_usage(username)
394
+ percent = usage.total_bytes / usage.quota_bytes * 100
395
+
396
+ # Should be > 80%
397
+ assert percent > 80
398
+
399
+
400
+ class TestReAnalysisWorkflow:
401
+ """Test re-analyzing existing slides without re-uploading."""
402
+
403
+ @patch("mosaic.ui.user_tabs.IS_HF_SPACES", False)
404
+ def test_select_existing_slide_for_analysis(
405
+ self, temp_storage_dir, mock_request_local
406
+ ):
407
+ """Test selecting an existing slide from dropdown."""
408
+ username = LOCAL_DEBUG_USERNAME
409
+
410
+ # Upload slide
411
+ temp_file = temp_storage_dir / "sample_slide.svs"
412
+ temp_file.write_text("fake slide data" * 1000)
413
+
414
+ slide_id, _ = save_uploaded_slide(username, temp_file)
415
+
416
+ # Get slide path (simulating dropdown selection)
417
+ from mosaic.user_storage import get_slide_path
418
+
419
+ slide_path = get_slide_path(username, slide_id)
420
+ assert slide_path is not None
421
+ assert slide_path.exists()
422
+
423
+ # Slide can now be analyzed without re-upload
424
+ # (In actual UI, this would populate input_slides component)
425
+
426
+
427
+ if __name__ == "__main__":
428
+ pytest.main([__file__, "-v"])
tests/test_user_results.py ADDED
@@ -0,0 +1,347 @@
1
+ """Unit tests for user_results module."""
2
+
3
+ import tempfile
4
+ from datetime import datetime
5
+ from pathlib import Path
6
+
7
+ import pandas as pd
8
+ import pytest
9
+ from PIL import Image
10
+
11
+ from mosaic.user_results import (
12
+ ResultMetadata,
13
+ create_results_zip,
14
+ delete_analysis_results,
15
+ delete_results_for_slide,
16
+ list_user_results,
17
+ load_analysis_results,
18
+ save_analysis_results,
19
+ )
20
+
21
+
22
+ @pytest.fixture
23
+ def temp_storage_base(tmp_path, monkeypatch):
24
+ """Temporarily override storage base to tmp_path."""
25
+ monkeypatch.setattr("mosaic.user_storage.USER_STORAGE_BASE", tmp_path)
26
+ return tmp_path
27
+
28
+
29
+ @pytest.fixture
30
+ def test_user():
31
+ """Test username."""
32
+ return "test_user"
33
+
34
+
35
+ @pytest.fixture
36
+ def test_analysis_id():
37
+ """Test analysis ID."""
38
+ return "analysis-123"
39
+
40
+
41
+ @pytest.fixture
42
+ def test_slide_id():
43
+ """Test slide ID."""
44
+ return "slide-456"
45
+
46
+
47
+ @pytest.fixture
48
+ def test_results():
49
+ """Test analysis results."""
50
+ aeon_df = pd.DataFrame(
51
+ {"Confidence": [0.95, 0.03, 0.02]}, index=["BRCA", "LUAD", "COAD"]
52
+ )
53
+ paladin_df = pd.DataFrame(
54
+ {"Biomarker": ["ER", "PR", "HER2"], "Prediction": [1, 1, 0]}
55
+ )
56
+ mask = Image.new("RGB", (100, 100), color="red")
57
+
58
+ return aeon_df, paladin_df, mask
59
+
60
+
61
+ def test_save_analysis_results(
62
+ temp_storage_base, test_user, test_analysis_id, test_slide_id, test_results
63
+ ):
64
+ """Test saving analysis results."""
65
+ aeon_df, paladin_df, mask = test_results
66
+
67
+ settings = {
68
+ "site_type": "Primary",
69
+ "cancer_subtype": "Unknown",
70
+ "ihc_subtype": "",
71
+ "seg_config": "Biopsy",
72
+ }
73
+
74
+ result = save_analysis_results(
75
+ username=test_user,
76
+ analysis_id=test_analysis_id,
77
+ slide_id=test_slide_id,
78
+ slide_name="test_slide.svs",
79
+ settings=settings,
80
+ aeon_results=aeon_df,
81
+ paladin_results=paladin_df,
82
+ slide_mask=mask,
83
+ )
84
+
85
+ assert result is True
86
+
87
+ # Verify files were created
88
+ from mosaic.user_results import _get_results_dir
89
+
90
+ results_dir = _get_results_dir(test_user, test_analysis_id)
91
+ assert (results_dir / "metadata.json").exists()
92
+ assert (results_dir / "aeon_results.csv").exists()
93
+ assert (results_dir / "paladin_results.csv").exists()
94
+ assert (results_dir / "slide_mask.png").exists()
95
+
96
+
97
+ def test_save_analysis_results_partial(
98
+ temp_storage_base, test_user, test_analysis_id, test_slide_id
99
+ ):
100
+ """Test saving results with some None values."""
101
+ settings = {"site_type": "Primary"}
102
+
103
+ result = save_analysis_results(
104
+ username=test_user,
105
+ analysis_id=test_analysis_id,
106
+ slide_id=test_slide_id,
107
+ slide_name="test_slide.svs",
108
+ settings=settings,
109
+ aeon_results=None,
110
+ paladin_results=None,
111
+ slide_mask=None,
112
+ )
113
+
114
+ assert result is True
115
+
116
+ # Metadata should still exist
117
+ from mosaic.user_results import _get_results_dir
118
+
119
+ results_dir = _get_results_dir(test_user, test_analysis_id)
120
+ assert (results_dir / "metadata.json").exists()
121
+
122
+
123
+ def test_load_analysis_results(
124
+ temp_storage_base, test_user, test_analysis_id, test_slide_id, test_results
125
+ ):
126
+ """Test loading analysis results."""
127
+ aeon_df, paladin_df, mask = test_results
128
+ settings = {"site_type": "Primary"}
129
+
130
+ # Save results first
131
+ save_analysis_results(
132
+ username=test_user,
133
+ analysis_id=test_analysis_id,
134
+ slide_id=test_slide_id,
135
+ slide_name="test_slide.svs",
136
+ settings=settings,
137
+ aeon_results=aeon_df,
138
+ paladin_results=paladin_df,
139
+ slide_mask=mask,
140
+ )
141
+
142
+ # Load results
143
+ loaded = load_analysis_results(test_user, test_analysis_id)
144
+ assert loaded is not None
145
+
146
+ metadata, loaded_mask, loaded_aeon, loaded_paladin = loaded
147
+
148
+ # Check metadata
149
+ assert isinstance(metadata, ResultMetadata)
150
+ assert metadata.analysis_id == test_analysis_id
151
+ assert metadata.slide_id == test_slide_id
152
+ assert metadata.slide_name == "test_slide.svs"
153
+ assert metadata.cancer_subtype == "BRCA" # Top prediction
154
+
155
+ # Check results
156
+ assert isinstance(loaded_mask, Image.Image)
157
+ assert isinstance(loaded_aeon, pd.DataFrame)
158
+ assert isinstance(loaded_paladin, pd.DataFrame)
159
+
160
+
161
+ def test_load_analysis_results_not_found(temp_storage_base, test_user):
162
+ """Test loading non-existent results."""
163
+ result = load_analysis_results(test_user, "nonexistent")
164
+ assert result is None
165
+
166
+
167
+ def test_list_user_results(temp_storage_base, test_user, test_slide_id, test_results):
168
+ """Test listing user results."""
169
+ aeon_df, paladin_df, mask = test_results
170
+ settings = {"site_type": "Primary"}
171
+
172
+ # Save multiple results
173
+ save_analysis_results(
174
+ username=test_user,
175
+ analysis_id="analysis-1",
176
+ slide_id=test_slide_id,
177
+ slide_name="slide1.svs",
178
+ settings=settings,
179
+ aeon_results=aeon_df,
180
+ paladin_results=paladin_df,
181
+ slide_mask=mask,
182
+ )
183
+
184
+ save_analysis_results(
185
+ username=test_user,
186
+ analysis_id="analysis-2",
187
+ slide_id="slide-other",
188
+ slide_name="slide2.svs",
189
+ settings=settings,
190
+ aeon_results=aeon_df,
191
+ paladin_results=paladin_df,
192
+ slide_mask=mask,
193
+ )
194
+
195
+ # List all results
196
+ all_results = list_user_results(test_user)
197
+ assert len(all_results) == 2
198
+
199
+ # List results for specific slide
200
+ slide_results = list_user_results(test_user, slide_id=test_slide_id)
201
+ assert len(slide_results) == 1
202
+ assert slide_results[0].analysis_id == "analysis-1"
203
+
204
+
205
+ def test_delete_analysis_results(
206
+ temp_storage_base, test_user, test_analysis_id, test_slide_id, test_results
207
+ ):
208
+ """Test deleting analysis results."""
209
+ aeon_df, paladin_df, mask = test_results
210
+ settings = {"site_type": "Primary"}
211
+
212
+ # Save results
213
+ save_analysis_results(
214
+ username=test_user,
215
+ analysis_id=test_analysis_id,
216
+ slide_id=test_slide_id,
217
+ slide_name="test_slide.svs",
218
+ settings=settings,
219
+ aeon_results=aeon_df,
220
+ paladin_results=paladin_df,
221
+ slide_mask=mask,
222
+ )
223
+
224
+ # Verify results exist
225
+ from mosaic.user_results import _get_results_dir
226
+
227
+ results_dir = _get_results_dir(test_user, test_analysis_id)
228
+ assert results_dir.exists()
229
+
230
+ # Delete results
231
+ assert delete_analysis_results(test_user, test_analysis_id) is True
232
+
233
+ # Verify results are gone
234
+ assert not results_dir.exists()
235
+
236
+ # Try deleting again (should return False)
237
+ assert delete_analysis_results(test_user, test_analysis_id) is False
238
+
239
+
240
+ def test_delete_results_for_slide(
241
+ temp_storage_base, test_user, test_slide_id, test_results
242
+ ):
243
+ """Test deleting all results for a slide."""
244
+ aeon_df, paladin_df, mask = test_results
245
+ settings = {"site_type": "Primary"}
246
+
247
+ # Save multiple results for the same slide
248
+ save_analysis_results(
249
+ username=test_user,
250
+ analysis_id="analysis-1",
251
+ slide_id=test_slide_id,
252
+ slide_name="test_slide.svs",
253
+ settings=settings,
254
+ aeon_results=aeon_df,
255
+ paladin_results=paladin_df,
256
+ slide_mask=mask,
257
+ )
258
+
259
+ save_analysis_results(
260
+ username=test_user,
261
+ analysis_id="analysis-2",
262
+ slide_id=test_slide_id,
263
+ slide_name="test_slide.svs",
264
+ settings=settings,
265
+ aeon_results=aeon_df,
266
+ paladin_results=paladin_df,
267
+ slide_mask=mask,
268
+ )
269
+
270
+ # Save result for different slide
271
+ save_analysis_results(
272
+ username=test_user,
273
+ analysis_id="analysis-3",
274
+ slide_id="other-slide",
275
+ slide_name="other.svs",
276
+ settings=settings,
277
+ aeon_results=aeon_df,
278
+ paladin_results=paladin_df,
279
+ slide_mask=mask,
280
+ )
281
+
282
+ # Delete results for test_slide_id
283
+ deleted = delete_results_for_slide(test_user, test_slide_id)
284
+ assert deleted == 2
285
+
286
+ # Verify correct results were deleted
287
+ remaining = list_user_results(test_user)
288
+ assert len(remaining) == 1
289
+ assert remaining[0].slide_id == "other-slide"
290
+
291
+
292
+ def test_create_results_zip(
293
+ temp_storage_base, test_user, test_analysis_id, test_slide_id, test_results
294
+ ):
295
+ """Test creating ZIP file from results."""
296
+ aeon_df, paladin_df, mask = test_results
297
+ settings = {"site_type": "Primary"}
298
+
299
+ # Save results
300
+ save_analysis_results(
301
+ username=test_user,
302
+ analysis_id=test_analysis_id,
303
+ slide_id=test_slide_id,
304
+ slide_name="test_slide.svs",
305
+ settings=settings,
306
+ aeon_results=aeon_df,
307
+ paladin_results=paladin_df,
308
+ slide_mask=mask,
309
+ )
310
+
311
+ # Create ZIP
312
+ zip_path = create_results_zip(test_user, test_analysis_id)
313
+ assert zip_path is not None
314
+ assert zip_path.exists()
315
+ assert zip_path.suffix == ".zip"
316
+
317
+ # Verify ZIP contains expected files
318
+ import zipfile
319
+
320
+ with zipfile.ZipFile(zip_path, "r") as zipf:
321
+ names = zipf.namelist()
322
+ assert "metadata.json" in names
323
+ assert "aeon_results.csv" in names
324
+ assert "paladin_results.csv" in names
325
+ assert "slide_mask.png" in names
326
+
327
+
328
+ def test_create_results_zip_not_found(temp_storage_base, test_user):
329
+ """Test creating ZIP for non-existent results."""
330
+ zip_path = create_results_zip(test_user, "nonexistent")
331
+ assert zip_path is None
332
+
333
+
334
+ def test_result_metadata_dataclass():
335
+ """Test ResultMetadata dataclass."""
336
+ metadata = ResultMetadata(
337
+ analysis_id="test-analysis",
338
+ slide_id="test-slide",
339
+ slide_name="test.svs",
340
+ timestamp="2024-01-01T00:00:00",
341
+ settings={"site_type": "Primary"},
342
+ cancer_subtype="BRCA",
343
+ )
344
+
345
+ assert metadata.analysis_id == "test-analysis"
346
+ assert metadata.slide_id == "test-slide"
347
+ assert metadata.cancer_subtype == "BRCA"
tests/test_user_storage.py ADDED
@@ -0,0 +1,246 @@
1
+ """Unit tests for user_storage module."""
2
+
3
+ import json
4
+ import tempfile
5
+ import uuid
6
+ from datetime import datetime
7
+ from pathlib import Path
8
+ from unittest.mock import patch
9
+
10
+ import pytest
11
+
12
+ from mosaic.user_storage import (
13
+ DEFAULT_QUOTA_BYTES,
14
+ SlideMetadata,
15
+ StorageInfo,
16
+ cleanup_old_files,
17
+ delete_user_slide,
18
+ get_slide_path,
19
+ get_storage_usage,
20
+ get_user_storage_dir,
21
+ list_user_slides,
22
+ save_uploaded_slide,
23
+ )
24
+
25
+
26
+ @pytest.fixture
27
+ def temp_storage_base(tmp_path, monkeypatch):
28
+ """Temporarily override storage base to tmp_path."""
29
+ monkeypatch.setattr("mosaic.user_storage.USER_STORAGE_BASE", tmp_path)
30
+ return tmp_path
31
+
32
+
33
+ @pytest.fixture
34
+ def test_user():
35
+ """Test username."""
36
+ return "test_user"
37
+
38
+
39
+ @pytest.fixture
40
+ def test_slide_file(tmp_path):
41
+ """Create a temporary test slide file."""
42
+ slide_path = tmp_path / "test_slide.svs"
43
+ # Create a 1KB test file
44
+ slide_path.write_bytes(b"x" * 1024)
45
+ return slide_path
46
+
47
+
48
+ def test_get_user_storage_dir(temp_storage_base, test_user):
49
+ """Test creating user storage directory."""
50
+ user_dir = get_user_storage_dir(test_user)
51
+
52
+ assert user_dir.exists()
53
+ assert user_dir == temp_storage_base / test_user
54
+ assert (user_dir / "slides").exists()
55
+ assert (user_dir / "results").exists()
56
+
57
+
58
+ def test_save_uploaded_slide(temp_storage_base, test_user, test_slide_file):
59
+ """Test saving an uploaded slide."""
60
+ slide_id, saved_path = save_uploaded_slide(test_user, test_slide_file)
61
+
62
+ # Check slide ID is a valid UUID
63
+ assert uuid.UUID(slide_id)
64
+
65
+ # Check file was copied
66
+ saved_path = Path(saved_path)
67
+ assert saved_path.exists()
68
+ assert saved_path.stat().st_size == 1024
69
+
70
+ # Check metadata was created
71
+ user_dir = get_user_storage_dir(test_user)
72
+ metadata_path = user_dir / "slides" / "metadata.json"
73
+ assert metadata_path.exists()
74
+
75
+ with open(metadata_path, "r") as f:
76
+ metadata = json.load(f)
77
+
78
+ assert slide_id in metadata
79
+ assert metadata[slide_id]["original_filename"] == "test_slide.svs"
80
+ assert metadata[slide_id]["size_bytes"] == 1024
81
+
82
+
83
+ def test_save_uploaded_slide_quota_exceeded(
84
+ temp_storage_base, test_user, test_slide_file
85
+ ):
86
+ """Test quota enforcement when saving slides."""
87
+ # Set quota to 512 bytes (smaller than test file)
88
+ with pytest.raises(ValueError, match="Storage quota exceeded"):
89
+ save_uploaded_slide(test_user, test_slide_file, quota_bytes=512)
90
+
91
+
92
+ def test_list_user_slides(temp_storage_base, test_user, test_slide_file):
93
+ """Test listing user slides."""
94
+ # Save two slides
95
+ slide_id1, _ = save_uploaded_slide(test_user, test_slide_file)
96
+ slide_id2, _ = save_uploaded_slide(test_user, test_slide_file)
97
+
98
+ slides = list_user_slides(test_user)
99
+
100
+ assert len(slides) == 2
101
+ # Should be sorted by upload time, newest first
102
+ assert slides[0].slide_id == slide_id2
103
+ assert slides[1].slide_id == slide_id1
104
+ assert all(isinstance(s, SlideMetadata) for s in slides)
105
+
106
+
107
+ def test_get_slide_path(temp_storage_base, test_user, test_slide_file):
108
+ """Test getting slide file path."""
109
+ slide_id, saved_path = save_uploaded_slide(test_user, test_slide_file)
110
+
111
+ retrieved_path = get_slide_path(test_user, slide_id)
112
+ assert retrieved_path == Path(saved_path)
113
+ assert retrieved_path.exists()
114
+
115
+ # Test non-existent slide
116
+ assert get_slide_path(test_user, "nonexistent") is None
117
+
118
+
119
+ def test_delete_user_slide(temp_storage_base, test_user, test_slide_file):
120
+ """Test deleting a slide."""
121
+ slide_id, saved_path = save_uploaded_slide(test_user, test_slide_file)
122
+
123
+ # Verify file exists
124
+ assert Path(saved_path).exists()
125
+
126
+ # Delete slide
127
+ assert delete_user_slide(test_user, slide_id) is True
128
+
129
+ # Verify file is gone
130
+ assert not Path(saved_path).exists()
131
+
132
+ # Verify metadata is updated
133
+ slides = list_user_slides(test_user)
134
+ assert len(slides) == 0
135
+
136
+ # Test deleting non-existent slide
137
+ assert delete_user_slide(test_user, "nonexistent") is False
138
+
139
+
140
+ def test_get_storage_usage(temp_storage_base, test_user, test_slide_file):
141
+ """Test calculating storage usage."""
142
+ # Initially empty
143
+ usage = get_storage_usage(test_user)
144
+ assert usage.total_bytes == 0
145
+ assert usage.file_count == 0
146
+ assert usage.quota_bytes == DEFAULT_QUOTA_BYTES
147
+
148
+ # Save a slide
149
+ save_uploaded_slide(test_user, test_slide_file)
150
+
151
+ # Check updated usage
152
+ usage = get_storage_usage(test_user)
153
+ assert usage.total_bytes == 1024
154
+ assert usage.file_count == 1
155
+
156
+
157
+ def test_cleanup_old_files(temp_storage_base, test_user, tmp_path):
158
+ """Test FIFO cleanup of old files."""
159
+ # Create three test files of different sizes
160
+ slide1 = tmp_path / "slide1.svs"
161
+ slide2 = tmp_path / "slide2.svs"
162
+ slide3 = tmp_path / "slide3.svs"
163
+
164
+ slide1.write_bytes(b"x" * 1024) # 1KB
165
+ slide2.write_bytes(b"x" * 2048) # 2KB
166
+ slide3.write_bytes(b"x" * 3072) # 3KB
167
+
168
+ # Save slides (oldest to newest)
169
+ id1, _ = save_uploaded_slide(test_user, slide1)
170
+ id2, _ = save_uploaded_slide(test_user, slide2)
171
+ id3, _ = save_uploaded_slide(test_user, slide3)
172
+
173
+ # Total usage: 6KB (1KB + 2KB + 3KB)
174
+ usage = get_storage_usage(test_user)
175
+ assert usage.total_bytes == 6144
176
+
177
+ # Cleanup to 4KB target
178
+ # Should delete oldest files: slide1 (1KB) + slide2 (2KB) = 3KB deleted
179
+ # Remaining: slide3 (3KB) which is ≤ 4KB target
180
+ deleted = cleanup_old_files(test_user, target_bytes=4096)
181
+ assert deleted == 2
182
+
183
+ # Verify slide1 and slide2 are gone, slide3 remains
184
+ assert get_slide_path(test_user, id1) is None
185
+ assert get_slide_path(test_user, id2) is None
186
+ assert get_slide_path(test_user, id3) is not None
187
+
188
+ # Check final usage
189
+ usage = get_storage_usage(test_user)
190
+ assert usage.total_bytes == 3072 # 3KB only
191
+
192
+
193
+ def test_cleanup_old_files_no_action_needed(
194
+ temp_storage_base, test_user, test_slide_file
195
+ ):
196
+ """Test cleanup when already under target."""
197
+ save_uploaded_slide(test_user, test_slide_file)
198
+
199
+ # Target is higher than current usage
200
+ deleted = cleanup_old_files(test_user, target_bytes=10240)
201
+ assert deleted == 0
202
+
203
+ # Slide should still exist
204
+ slides = list_user_slides(test_user)
205
+ assert len(slides) == 1
206
+
207
+
208
+ def test_storage_info_persistence(temp_storage_base, test_user, test_slide_file):
209
+ """Test that storage info persists across function calls."""
210
+ save_uploaded_slide(test_user, test_slide_file)
211
+
212
+ # Get usage first time
213
+ usage1 = get_storage_usage(test_user)
214
+
215
+ # Get usage again (should load from saved file)
216
+ usage2 = get_storage_usage(test_user)
217
+
218
+ assert usage1.total_bytes == usage2.total_bytes
219
+ assert usage1.file_count == usage2.file_count
220
+
221
+
222
+ def test_slide_metadata_dataclass():
223
+ """Test SlideMetadata dataclass."""
224
+ metadata = SlideMetadata(
225
+ slide_id="test-id",
226
+ original_filename="test.svs",
227
+ upload_time="2024-01-01T00:00:00",
228
+ size_bytes=1024,
229
+ file_extension=".svs",
230
+ )
231
+
232
+ assert metadata.slide_id == "test-id"
233
+ assert metadata.original_filename == "test.svs"
234
+ assert metadata.size_bytes == 1024
235
+
236
+
237
+ def test_storage_info_dataclass():
238
+ """Test StorageInfo dataclass."""
239
+ info = StorageInfo(
240
+ total_bytes=1024, file_count=5, quota_bytes=5368709120, last_cleanup=None
241
+ )
242
+
243
+ assert info.total_bytes == 1024
244
+ assert info.file_count == 5
245
+ assert info.quota_bytes == 5368709120
246
+ assert info.last_cleanup is None