unikill066 commited on
Commit
c843d82
·
verified ·
1 Parent(s): da22f5d

Upload 29 files

Browse files
LICENSE ADDED
@@ -0,0 +1,21 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Nikhil I
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,107 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Spinal Cord Segmentation Pipeline
2
+
3
+ Automated, end‑to‑end processing and segmentation of spinal‑cord microscopy images with [Cellpose](https://cellpose.readthedocs.io/).
4
+
5
+ ## Overview
6
+
7
+ This repository provides a **turn‑key workflow** for turning raw histological slides of the spinal cord (TIFF) into high‑quality, full‑resolution segmentation masks—in a *single command*.
8
+
9
+ ## Key Features
10
+
11
+ | Stage | Purpose |
12
+ |-------|---------|
13
+ | **TIFF → PNG conversion** | Converts raw `.tiff` slides to compressed `.png`, with optional down‑scaling to speed up processing. |
14
+ | **Smart tiling** | Splits very large images into manageable tiles that fit comfortably in GPU/CPU memory. |
15
+ | **Cellpose inference** | Runs the *cyto3* (default) or any other Cellpose model on every tile. |
16
+ | **Mask stitching** | Re‑assembles the individual tile masks into a single, full‑resolution segmentation mask. |
17
+
18
+ ## Requirements
19
+
20
+ * Python **3.9+**
21
+ * GPU‑enabled PyTorch build (optional but recommended)
22
+ * Dependencies (installed automatically via `requirements.txt`):
23
+ * `cellpose==3.1.1.1`
24
+ * `opencv-python`
25
+ * `numpy`
26
+ * `pillow`
27
+ * `tifffile`
28
+
29
+ ## Installation
30
+
31
+ ```bash
32
+ # Clone the repository
33
+ git clone https://github.com/your-username/spinal-cord-segmentation.git
34
+ cd spinal-cord-segmentation
35
+
36
+ # Create / activate a virtualenv (optional but recommended)
37
+ python -m venv .venv
38
+ source .venv/bin/activate # Windows: .venv\Scripts\activate
39
+
40
+ # Install Python dependencies
41
+ pip install -r requirements.txt
42
+ ```
43
+
44
+ ## Quick Start
45
+
46
+ 1. **Place** your raw `.tiff` images in `data/input/` (or adjust the paths in `bin/constants.py`).
47
+ 2. **Run** the pipeline:
48
+
49
+ ```bash
50
+ python main.py
51
+ ```
52
+ 3. **Collect** your results:
53
+ * PNG conversions → `data/png/`
54
+ * Split tiles → `data/tiles/`
55
+ * Cellpose masks → `data/masks/`
56
+ * Stitched masks → `data/output/`
57
+
58
+ ## Detailed Workflow
59
+
60
+ ```mermaid
61
+ flowchart LR
62
+ A[TIFF images] --> B[generate_pngs.py]:::step
63
+ classDef step fill:#fafafa,stroke:#333,stroke-width:1px;
64
+ B --> C[generate_split_images.py]:::step
65
+ C --> D[run_cellpose.py]:::step
66
+ D --> E[generate_masks.py]:::step
67
+ E --> F[Final segmentation]:::step
68
+ ```
69
+
70
+ *All paths, tile overlap, and Cellpose parameters are configurable in* **`bin/constants.py`**.
71
+
72
+ ## Project Layout
73
+
74
+ ```
75
+ .
76
+ ├── main.py # Orchestrates the full pipeline
77
+ ├── bin/
78
+ │ ├── constants.py # Centralised paths & tunables
79
+ │ ├── generate_pngs.py # TIFF → PNG converter
80
+ │ ├── generate_split_images.py
81
+ │ └── generate_masks.py
82
+ ├── model/
83
+ │ └── run_cellpose.py # Wrapper around Cellpose API
84
+ ├── requirements.txt
85
+ └── LICENSE
86
+ ```
87
+
88
+ ## License
89
+
90
+ Distributed under the terms of the **MIT License**. See `LICENSE` for full text.
91
+
92
+ ## Contributing
93
+
94
+ Contributions, issues and feature requests are welcome! Please open an issue or submit a pull request — and ensure your code passes `flake8`/`black` checks and includes appropriate tests.
95
+
96
+ ## Citation
97
+
98
+ If you use this pipeline in your research, please cite *Cellpose* **and** this repository:
99
+
100
+ ```text
101
+ @article{stringer_cellpose_2021,
102
+ title = {Cellpose: a generalist algorithm for cellular segmentation},
103
+ author = {Stringer, Carsen and Pachitariu, Marius},
104
+ journal = {Nature Methods},
105
+ year = {2021}
106
+ }
107
+ ```
bin/__init__.py ADDED
File without changes
bin/train_cellpose.py ADDED
@@ -0,0 +1,31 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""Fine-tune a Cellpose segmentation model on paired image/mask data.

Expects a training directory whose images are named ``*_img*`` and whose
masks are named ``*_masks*`` (the filters passed to
``io.load_train_test_data`` below).
"""
from cellpose import io, models, train

io.logger_setup()

# FIX: ``train_dir`` / ``test_dir`` were previously used without ever being
# defined, which raised NameError as soon as the script ran. Define them
# explicitly here — adjust to your local data layout.
train_dir = "train"
test_dir = None  # optional held-out evaluation set

# Unpacks to: train images/labels/names, then test images/labels/names.
output = io.load_train_test_data(
    train_dir,
    test_dir,
    image_filter="_img",
    mask_filter="_masks",
    look_one_level_down=False,
)
images, labels, image_names, test_images, test_labels, image_names_test = output

# GPU-backed Cellpose model with the default pretrained weights.
model = models.CellposeModel(gpu=True)

# Fine-tune the segmentation network; hyper-parameters mirror the CLI
# command kept below for reference.
model_path, train_losses, test_losses = train.train_seg(
    model.net,
    train_data=images,
    train_labels=labels,
    test_data=test_images,
    test_labels=test_labels,
    weight_decay=0.1,
    learning_rate=1e-5,
    n_epochs=100,
    model_name="my_new_model",
)

# Equivalent CLI invocation, kept for reference:
# python -m cellpose --train --dir <train_dir> \
#   --learning_rate 1e-5 --weight_decay 0.1 --n_epochs 100 \
#   --batch_size 1 --verbose
bin/train_cellpose_sam.py ADDED
@@ -0,0 +1,54 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""Fine-tune a Cellpose-SAM model on a folder of image/mask pairs."""
# imports
import numpy as np
from cellpose import models, core, io, plot, train
from pathlib import Path
from tqdm import trange
import matplotlib.pyplot as plt

io.logger_setup()  # enable progress printing

train_dir = "/mnt/WorkingDos/cellpose_sam/8_hdrg_jayden_dataset_data"
model_name = "cp_sam_hdrg_topoint_model"


def train_cp_sam_model(train_dir, model_name, n_epochs=100, learning_rate=1e-5,
                       weight_decay=0.1, batch_size=1):
    """
    Train a Cellpose model using the SAM (Segment Anything) algorithm.

    Args:
        train_dir (str): Path to the directory containing the training data.
        model_name (str): Name of the model to be trained.
        n_epochs (int): Number of epochs to train the model.
        learning_rate (float): Learning rate for the optimizer.
        weight_decay (float): Weight decay for the optimizer.
        batch_size (int): Batch size for training.

    Returns:
        Path to the newly trained model weights (as returned by
        ``train.train_seg``).

    Raises:
        ImportError: If no GPU is available.
        FileNotFoundError: If ``train_dir`` is missing or holds no images.
    """
    # SAM training is impractical on CPU, so bail out early without a GPU.
    if not core.use_gpu():
        raise ImportError("No GPU access, change your runtime")

    model = models.CellposeModel(gpu=True)

    if not Path(train_dir).exists():
        raise FileNotFoundError("directory does not exist")

    test_dir = None  # optionally you can specify a directory with test files

    # Mask files share the image stem plus this suffix.
    # Use "_seg.npy" for Cellpose-GUI exports, "_masks.tif" for tiff labels.
    masks_ext = "_masks"

    # List raw images only — skip mask/flow/seg companion files.
    files = [
        f for f in Path(train_dir).glob("*")
        if "_masks" not in f.name and "_flows" not in f.name and "_seg" not in f.name
    ]

    if len(files) == 0:
        raise FileNotFoundError("no files found, did you specify the correct folder and extension?")
    print(f"{len(files)} files in folder:")

    output = io.load_train_test_data(train_dir, test_dir, mask_filter=masks_ext)
    train_data, train_labels, _, test_data, test_labels, _ = output

    # nimg_per_epoch is clamped to at least 2 so tiny datasets still train.
    new_model_path, train_losses, test_losses = train.train_seg(
        model.net,
        train_data=train_data,
        train_labels=train_labels,
        batch_size=batch_size,
        n_epochs=n_epochs,
        learning_rate=learning_rate,
        weight_decay=weight_decay,
        nimg_per_epoch=max(2, len(train_data)),
        model_name=model_name,
    )
    return new_model_path


if __name__ == "__main__":
    # FIX: the original call passed n_epochs / learning_rate / weight_decay /
    # batch_size as bare names that were never defined (NameError). Rely on
    # the function's defaults instead, which carry the same values.
    train_cp_sam_model(train_dir, model_name)
detect.py ADDED
@@ -0,0 +1,67 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""Run batch Cellpose inference with a locally trained model, then union the
masks produced by a sweep over cell diameters into stitched outputs."""
# imports
import logging, numpy as np, matplotlib.pyplot as plt, os
from pathlib import Path
from model.run_cellpose import CellposeBatchProcessor
from utils.constants import *
from skimage.measure import label
from skimage.io import imsave  # hoisted: was imported inside the loop below
from tifffile import imwrite
from utils.generate_masks import MaskStitcher
from PIL import Image

# Batch inference with the locally trained model.
setup_logging(logging.INFO)
processor = CellposeBatchProcessor(
    input_dir = Path("/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/cellpose_test"),
    output_dir = Path("/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/cellpose_outs"),
    # point to the folder that contains cellpose_model.pth + .yaml
    model_name = "/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/train/models/cellpose_1746568542.462492",
    bsize = 1024,
    overlap = 0.15,
    batch_size = 6,
    gpu = 0,                   # set to -1 if you must run CPU
    channels = (2, 0),         # or whatever channels you trained with
    diameter = CELL_DIAMETER,  # or None to auto-scale
)
processor.process_all()

## - x - x - x - x - x - x - x - x - x - x - x - x
# Union masks produced by a sweep over cell diameters (one output folder per
# diameter is assumed to already exist under OUTPUT_DIR).
RANGE_CELL_DIAMETER = list(range(20, 60, 5))
INPUT_DIR = Path("/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/cellpose_test")
OUTPUT_DIR = Path("/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/cellpose_outs")

MASK_SUBDIR = "masks"
STITCHED_DIR = OUTPUT_DIR / "stitched"
STITCHED_DIR.mkdir(parents=True, exist_ok=True)

# Use the first diameter's outputs as the canonical list of tile names.
first_masks = sorted((OUTPUT_DIR / str(RANGE_CELL_DIAMETER[0]) / MASK_SUBDIR).glob("*.png"))

for mask_path in first_masks:
    name = mask_path.name
    union_mask = None
    # OR together the binary masks produced at every diameter.
    for d in RANGE_CELL_DIAMETER:
        p = OUTPUT_DIR / str(d) / MASK_SUBDIR / name
        arr = np.array(Image.open(p)) > 0
        if union_mask is None:
            union_mask = arr
        else:
            union_mask |= arr

    # Label connected components so each cell gets a unique integer id.
    union_lbl = label(union_mask)
    # FIX: was name.replace(".png", "tif"), which dropped the extension dot
    # and produced files like "sampletif" with no recognizable extension.
    out_tif = STITCHED_DIR / name.replace(".png", ".tif")
    imwrite(out_tif, union_lbl.astype(np.uint16))
    imsave(STITCHED_DIR / name, (union_mask * 255).astype(np.uint8))
    print(f"Stitched: {name} → {out_tif.name}")
detect_sam.py ADDED
@@ -0,0 +1,49 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
"""Run a pretrained Cellpose-SAM model over a folder of images and save the
predicted masks next to ``output_dir`` with a ``_pred_masks`` suffix."""
# imports
import logging, numpy as np, matplotlib.pyplot as plt, os
from pathlib import Path
from model.run_cellpose import CellposeBatchProcessor
from utils.constants import *
from skimage.measure import label
from tifffile import imwrite
from utils.generate_masks import MaskStitcher
from PIL import Image
from cellpose.io import imread
from cellpose import models, core, io, plot
from tqdm import trange
from natsort import natsorted

# Inference configuration.
image_ext = ".tif"
masks_ext = ".png" if image_ext == ".png" else ".tif"
flow_threshold = 0.8
cellprob_threshold = 0.0
tile_norm_blocksize = 0

# SAM inference is impractically slow on CPU, so bail out early.
if not core.use_gpu():
    raise ImportError("No GPU access, change your runtime")

model = models.CellposeModel(pretrained_model="/Users/discovery/Desktop/spinal_cord_segmentation/model/cellpose_sam_neun", gpu=True)

input_dir = Path("/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/cellpose_imgs/data")
output_dir = Path("/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/cellpose_outs")
output_dir.mkdir(parents=True, exist_ok=True)

# Collect input images in natural sort order, skipping mask/flow companions.
files = natsorted(
    f for f in input_dir.glob("*" + image_ext)
    if "_masks" not in f.name and "_flows" not in f.name
)

if not files:
    raise FileNotFoundError("no image files found, did you specify the correct folder and extension?")
print(f"{len(files)} images in folder:")

# Load every image up front; trange shows a progress bar.
imgs = []
for idx in trange(len(files)):
    imgs.append(io.imread(files[idx]))

masks, flows, styles = model.eval(
    imgs,
    batch_size=32,
    flow_threshold=flow_threshold,
    cellprob_threshold=cellprob_threshold,
    normalize={"tile_norm_blocksize": tile_norm_blocksize},
)

print("saving masks")
for idx in trange(len(files)):
    out_name = files[idx].stem + "_pred_masks" + masks_ext
    io.imsave(output_dir / out_name, masks[idx])
docs/spinal_cord_cell_segmentation.md ADDED
@@ -0,0 +1,1117 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Wiki Documentation for https://github.com/unikill066/spinal_cord_cell_segmentation
2
+
3
+ Generated on: 2025-06-12 15:01:36
4
+
5
+ ## Table of Contents
6
+
7
+ - [Home Page](#page-1)
8
+ - [System Architecture](#page-2)
9
+ - [Core Features](#page-3)
10
+ - [Data Management/Flow](#page-4)
11
+ - [Frontend Components](#page-5)
12
+ - [Backend Systems](#page-6)
13
+ - [Model Integration](#page-7)
14
+ - [Deployment/Infrastructure](#page-8)
15
+ - [Extensibility and Customization](#page-9)
16
+
17
+ <a id='page-1'></a>
18
+
19
+ ## Home Page
20
+
21
+
22
+
23
+
24
+
25
+ <details>
26
+ <summary>Relevant source files</summary>
27
+
28
+ - README.md
29
+ - utils/generate_split_images.py
30
+ - utils/generate_pngs.py
31
+ - model/run_cellpose.py
32
+ - generate_training_data.py
33
+ </details>
34
+
35
+ # Home Page
36
+
37
+ This page provides an overview of the core functionalities and architecture of the Spinal Cord Cell Segmentation project. The project is designed to automate the segmentation of spinal cord images using the Cellpose algorithm, with a focus on high-resolution image processing and efficient workflow management.
38
+
39
+ ## Introduction
40
+
41
+ The Spinal Cord Cell Segmentation project offers a turnkey workflow for converting raw histological slides into high-quality segmentation masks. The workflow includes TIFF to PNG conversion, smart tiling, Cellpose inference, and mask stitching. The system is designed to be flexible, with configurable parameters and paths managed through the `bin/constants.py` file. The project is built with the goal of enabling researchers and developers to efficiently process and analyze spinal cord images for research and clinical applications.
42
+
43
+ ## Detailed Sections
44
+
45
+ ### Architecture and Components
46
+
47
+ The project is structured with a clear modular architecture, consisting of several key components:
48
+
49
+ - **Main.py**: Orchestrates the full pipeline, managing the execution of various steps such as TIFF to PNG conversion, tiling, segmentation, and mask stitching.
50
+ - **bin/constants.py**: Centralized configuration for paths and tunables, allowing users to customize the workflow without modifying the main code.
51
+ - **utils/generate_split_images.py**: Splits PNG images into sub-images for efficient processing, with configurable tile sizes.
52
+ - **utils/generate_pngs.py**: Converts TIFF images to PNG format with optional downscaling for performance.
53
+ - **model/run_cellpose.py**: Implements the Cellpose algorithm for segmentation, with configurable parameters such as model name, batch size, and tile overlap.
54
+ - **generate_training_data.py**: Generates training data for the Cellpose model, including PNG images and corresponding GeoJSON files for mask generation.
55
+
56
+ ### Key Functions and Classes
57
+
58
+ - **CellposeBatchProcessor**: A class that processes a directory of images using the Cellpose algorithm, saving masks, previews, and segmentation arrays into designated directories.
59
+ - **ImageSplitter**: A class that splits PNG images into sub-images, with configurable tile sizes.
60
+ - **MaskStitcher**: A class that stitches tiled .npy masks into full-size masks.
61
+ - **OverlayGenerator**: A class that creates overlays of original PNGs with their corresponding masks and generates side-by-side comparison mosaics.
62
+
63
+ ### Mermaid Diagrams
64
+
65
+ ```mermaid
66
+ flowchart TD
67
+ A[TIFF images] --> B[generate_pngs.py]
68
+ B --> C[generate_split_images.py]
69
+ C --> D[run_cellpose.py]
70
+ D --> E[generate_masks.py]
71
+ E --> F[Final segmentation]
72
+ ```
73
+
74
+ This diagram shows the flow of the pipeline from raw TIFF images to final segmentation masks.
75
+
76
+ ### Tables
77
+
78
+ | Component | Description |
79
+ |----------|-------------|
80
+ | `bin/constants.py` | Centralized configuration for paths and tunables |
81
+ | `utils/generate_split_images.py` | Splits PNG images into sub-images |
82
+ | `utils/generate_pngs.py` | Converts TIFF images to PNG format |
83
+ | `model/run_cellpose.py` | Implements the Cellpose algorithm for segmentation |
84
+ | `generate_training_data.py` | Generates training data for the Cellpose model |
85
+
86
+ ### Code Snippets
87
+
88
+ ```python
89
+ # Example of a CellposeBatchProcessor call
90
+ processor = CellposeBatchProcessor(
91
+ input_dir=SPLIT_IMAGES_DIR,
92
+ output_dir=CELLPOSE_MASKS_DIR,
93
+ model_name="cyto3_restore",
94
+ bsize=2048,
95
+ overlap=0.15,
96
+ batch_size=6,
97
+ gpu=0,
98
+ channels=(1, 0)
99
+ )
100
+ processor.process_all()
101
+ ```
102
+
103
+ ### Source Citations
104
+
105
+ - `README.md`: Overview of the project and its components
106
+ - `utils/generate_split_images.py`: Implementation of image splitting
107
+ - `utils/generate_pngs.py`: Implementation of TIFF to PNG conversion
108
+ - `model/run_cellpose.py`: Implementation of the Cellpose algorithm
109
+ - `generate_training_data.py`: Implementation of training data generation
110
+
111
+ This page provides a comprehensive overview of the Spinal Cord Cell Segmentation project, focusing on its architecture, components, and key functionalities. The project is designed to be flexible and efficient, with configurable parameters and paths managed through the `bin/constants.py` file. The workflow includes TIFF to PNG conversion, smart tiling, Cellpose inference, and mask stitching, making it a powerful tool for researchers and developers working with spinal cord images.
112
+
113
+ ---
114
+
115
+ <a id='page-2'></a>
116
+
117
+ ## System Architecture
118
+
119
+
120
+
121
+
122
+
123
+ <details>
124
+ <summary>Relevant source files</summary>
125
+
126
+ - README.md
127
+ - utils/constants.py
128
+ - model/run_cellpose.py
129
+ - utils/generate_pngs.py
130
+ - utils/generate_split_images.py
131
+ </details>
132
+
133
+ # System Architecture
134
+
135
+ This system architecture provides an end-to-end pipeline for segmenting spinal cord images using Cellpose. The architecture is designed to be modular, configurable, and efficient, with all components centralized in the `bin/` directory.
136
+
137
+ ## Introduction
138
+
139
+ The system is built around a central orchestrator `main.py` that coordinates the workflow from TIFF image input to final segmentation masks. The architecture is composed of several key components:
140
+
141
+ - **TIFF → PNG Conversion**: Converts raw TIFF images to compressed PNGs with optional downscaling.
142
+ - **Smart Tiling**: Splits large images into manageable tiles for GPU/CPU processing.
143
+ - **Cellpose Inference**: Runs the cyto3 model or any other Cellpose model on each tile.
144
+ - **Mask Stitching**: Re-assembles tile masks into a full-resolution segmentation mask.
145
+
146
+ All paths, tile overlap, and Cellpose parameters are configurable in `bin/constants.py`.
147
+
148
+ ## Detailed Sections
149
+
150
+ ### 1. Pipeline Overview
151
+
152
+ The pipeline is structured as a series of steps that are executed in a single command:
153
+
154
+ ```mermaid
155
+ flowchart TD
156
+ A[TIFF images] --> B[generate_pngs.py]:::step
157
+ B --> C[generate_split_images.py]:::step
158
+ C --> D[run_cellpose.py]:::step
159
+ D --> E[generate_masks.py]:::step
160
+ E --> F[Final segmentation]:::step
161
+ ```
162
+
163
+ ### 2. Core Components
164
+
165
+ #### 2.1. Orchestration
166
+
167
+ The orchestration is handled by `main.py`, which loads the configuration from `bin/constants.py` and runs the pipeline. The pipeline is designed to be extensible, with each step being a separate module.
168
+
169
+ #### 2.2. Image Processing
170
+
171
+ The `generate_pngs.py` module converts TIFF images to PNGs with optional downscaling. The `generate_split_images.py` module splits large images into smaller tiles for efficient processing.
172
+
173
+ #### 2.3. Cellpose Inference
174
+
175
+ The `run_cellpose.py` module is a wrapper around the Cellpose API. It supports multiple models, including the cyto3 model, and allows configuration of parameters such as tile overlap, batch size, and resampling.
176
+
177
+ #### 2.4. Mask Stitching
178
+
179
+ The `generate_masks.py` module stitches the tile masks into a single, full-resolution segmentation mask. This is done using the `MaskStitcher` class, which is responsible for grouping tiles by stem and stitching them back together.
180
+
181
+ ### 3. Mermaid Diagrams
182
+
183
+ ```mermaid
184
+ graph TD
185
+ A[TIFF images] --> B[generate_pngs.py]:::step
186
+ B --> C[generate_split_images.py]:::step
187
+ C --> D[run_cellpose.py]:::step
188
+ D --> E[generate_masks.py]:::step
189
+ E --> F[Final segmentation]:::step
190
+ ```
191
+
192
+ ### 4. Tables
193
+
194
+ #### 4.1. Configuration Options
195
+
196
+ | Configuration | Type | Default |
197
+ |---------------|------|---------|
198
+ | `constants.py` | File | Default values |
199
+ | `run_cellpose.py` | Module | Default parameters |
200
+
201
+ #### 4.2. Key Features
202
+
203
+ | Feature | Description |
204
+ |--------|-------------|
205
+ | TIFF → PNG conversion | Converts raw TIFF images to compressed PNGs |
206
+ | Smart tiling | Splits large images into manageable tiles |
207
+ | Cellpose inference | Runs the cyto3 model or any other Cellpose model |
208
+ | Mask stitching | Re-assembles tile masks into a full-resolution segmentation mask |
209
+
210
+ ### 5. Code Snippets
211
+
212
+ ```python
213
+ # Example of a configuration in constants.py
214
+ IMAGE_INPUT_DIR = Path("data/input")
215
+ IMAGE_OUTPUT_DIR = Path("data/output")
216
+ ```
217
+
218
+ ```python
219
+ # Example of a Cellpose inference in run_cellpose.py
220
+ model = models.CellposeModel(gpu=True, pretrained_model=model_path)
221
+ ```
222
+
223
+ ### 6. Source Citations
224
+
225
+ - `README.md`: Overview of the project and its components.
226
+ - `utils/constants.py`: Configuration parameters and paths.
227
+ - `model/run_cellpose.py`: Core Cellpose inference logic.
228
+ - `utils/generate_pngs.py`: TIFF to PNG conversion.
229
+ - `utils/generate_split_images.py`: Image splitting and tiling.
230
+
231
+ ---
232
+
233
+ <a id='page-3'></a>
234
+
235
+ ## Core Features
236
+
237
+
238
+
239
+
240
+
241
+ <details>
242
+ <summary>Relevant source files</summary>
243
+
244
+ - README.md
245
+ - utils/generate_masks.py
246
+ - utils/generate_pngs.py
247
+ - utils/generate_split_images.py
248
+ - utils/generate_image_overlays.py
249
+ </details>
250
+
251
+ # Core Features
252
+
253
+ This section provides an overview of the core features of the spinal cord cell segmentation pipeline, focusing on the key components and functionalities implemented in the project.
254
+
255
+ ## Introduction
256
+
257
+ The spinal cord cell segmentation pipeline is designed to automate the process of converting raw histological slides into high-resolution segmentation masks. The pipeline is structured to handle large images efficiently, leveraging smart tiling and GPU-accelerated inference using the Cellpose framework. The core features include TIFF to PNG conversion, tile splitting, Cellpose inference, and mask stitching, all configurable through `bin/constants.py`.
258
+
259
+ ## Detailed Sections
260
+
261
+ ### 1. Image Processing Pipeline
262
+
263
+ The pipeline consists of several stages that work together to process and segment images:
264
+
265
+ - **TIFF → PNG Conversion**: Converts raw TIFF images into compressed PNG files, allowing for optional downscaling to speed up processing.
266
+ - **Smart Tiling**: Splits large images into manageable tiles that fit comfortably in GPU/CPU memory, enabling efficient processing.
267
+ - **Cellpose Inference**: Runs the *cyto3* (default) or any other Cellpose model on every tile, leveraging the power of GPU for fast inference.
268
+ - **Mask Stitching**: Re-assembles the individual tile masks into a single, full-resolution segmentation mask.
269
+
270
+ ### 2. Key Components and Architecture
271
+
272
+ The pipeline is orchestrated by `main.py`, which orchestrates the full pipeline, and is supported by several key components:
273
+
274
+ - **`bin/constants.py`**: Contains configurable paths and tunables for all components.
275
+ - **`utils/generate_masks.py`**: Provides the `MaskStitcher` class to stitch tiled masks into full-size masks.
276
+ - **`utils/generate_split_images.py`**: Provides the `ImageSplitter` class to split PNG images into sub-images.
277
+ - **`utils/generate_pngs.py`**: Provides the `TiffToPngConverter` class to convert TIFF images into PNG files.
278
+ - **`utils/generate_image_overlays.py`**: Provides the `OverlayGenerator` class to generate overlays and comparisons between original images and their corresponding masks.
279
+
280
+ ### 3. Mermaid Diagram
281
+
282
+ ```mermaid
283
+ graph TD
284
+ A[TIFF images] --> B[generate_pngs.py]
285
+ B --> C[generate_split_images.py]
286
+ C --> D[run_cellpose.py]
287
+ D --> E[generate_masks.py]
288
+ E --> F[Final segmentation]
289
+ ```
290
+
291
+ ### 4. Tables
292
+
293
+ | Feature | Description |
294
+ |--------|-------------|
295
+ | TIFF → PNG Conversion | Converts raw TIFF images into compressed PNG files. |
296
+ | Smart Tiling | Splits large images into manageable tiles for efficient processing. |
297
+ | Cellpose Inference | Runs the *cyto3* (default) or any other Cellpose model on every tile. |
298
+ | Mask Stitching | Re-assembles the individual tile masks into a single, full-resolution segmentation mask. |
299
+ | Configurable Parameters | All paths, tile overlap, and Cellpose parameters are configurable in `bin/constants.py`. |
300
+
301
+ ### 5. Code Snippets
302
+
303
+ ```python
304
+ # Example of a mask stitcher
305
+ class MaskStitcher:
306
+ def __init__(self, input_dir: Path, output_dir: Path) -> None:
307
+ self.input_dir = Path(input_dir)
308
+ self.output_dir = Path(output_dir)
309
+ self.logger = logging.getLogger(self.__class__.__name__)
310
+ self._setup_output_directory()
311
+
312
+ def _setup_output_directory(self) -> None:
313
+ try:
314
+ self.output_dir.mkdir(parents=True, exist_ok=True)
315
+ self.logger.debug(f"Output directory ready: {self.output_dir}")
316
+ except Exception as e:
317
+ self.logger.error(f"Could not create output directory {self.output_dir}: {e}")
318
+ raise
319
+
320
+ def stitch_all(self) -> None:
321
+ all_files = list(self.input_dir.glob("*.npy"))
322
+ if not all_files:
323
+ self.logger.warning(f"No .npy files found in {self.input_dir}")
324
+ return
325
+
326
+ # group files by stem
327
+ stems = {}
328
+ for p in all_files:
329
+ m = self.TILE_PATTERN.match(p.name)
330
+ if not m:
331
+ self.logger.warning(f"Skipping unrecognized file name: {p.name}")
332
+ continue
333
+ stem = m.group("stem")
334
+ stems.setdefault(stem, []).append(p)
335
+
336
+ for stem, paths in stems.items():
337
+ try:
338
+ self._stitch_stem(stem, paths)
339
+ self.logger.info(f"Stitched mask for '{stem}' → {stem}.npy")
340
+ except Exception:
341
+ self.logger.exception(f"Failed to stitch tiles for '{stem}'")
342
+ ```
343
+
344
+ ### 6. Source Citations
345
+
346
+ - `utils/generate_masks.py`: `Sources: utils/generate_masks.py:12-15()`
347
+ - `utils/generate_split_images.py`: `Sources: utils/generate_split_images.py:28-31()`
348
+ - `utils/generate_pngs.py`: `Sources: utils/generate_pngs.py:45-48()`
349
+ - `utils/generate_image_overlays.py`: `Sources: utils/generate_image_overlays.py:60-63()`
350
+ - `main.py`: `Sources: main.py:10-12()`
351
+
352
+ ---
353
+
354
+ <a id='page-4'></a>
355
+
356
+ ## Data Management/Flow
357
+
358
+
359
+
360
+
361
+
362
+ <details>
363
+ <summary>Relevant source files</summary>
364
+
365
+ - [README.md](README.md)
366
+ - [utils/generate_training_dataset.py](utils/generate_training_dataset.py)
367
+ - [utils/generate_training_split_img_masks.py](utils/generate_training_split_img_masks.py)
368
+ - [utils/generate_masks.py](utils/generate_masks.py)
369
+ - [utils/generate_split_images.py](utils/generate_split_images.py)
370
+ </details>
371
+
372
+ # Data Management/Flow
373
+
374
+ This page provides a comprehensive overview of the data management and flow architecture within the **spinal_cord_cell_segmentation** project. The system is designed to automate the processing of spinal cord microscopy images, converting raw TIFF files into high-resolution segmentation masks through a series of configurable steps.
375
+
376
+ ## Introduction
377
+
378
+ The data management and flow system is a critical component of the project, responsible for orchestrating the entire pipeline from image acquisition to final segmentation. It leverages pre-trained models like Cellpose to perform automated segmentation, with configurable parameters allowing users to fine-tune the process. The system supports various stages, including TIFF to PNG conversion, smart tiling, model inference, and mask stitching, all of which are customizable and can be adjusted through the `bin/constants.py` file.
379
+
380
+ ## Detailed Sections
381
+
382
+ ### 1. Pipeline Architecture
383
+
384
+ The pipeline is structured as a series of steps, each of which is responsible for a specific part of the image processing workflow. These steps are orchestrated by the `main.py` file, which acts as the central controller of the entire process.
385
+
386
+ ### 2. Key Components
387
+
388
+ #### a. `bin/constants.py`
389
+ This file contains all the configurable parameters for the pipeline, including paths to input and output directories, model parameters, and other tunable settings. Users can adjust these values to suit their specific needs.
390
+
391
+ #### b. `utils/generate_training_split_img_masks.py`
392
+ This module handles the generation of training data by splitting and processing TIFF images into smaller tiles. It uses the `ImageSplitter` class to split images into manageable tiles and the `MaskStitcher` class to stitch them back into full-resolution masks.
393
+
394
+ #### c. `utils/generate_masks.py`
395
+ This file contains the implementation of the mask stitching functionality. It uses the `MaskStitcher` class to group tiles by their stem and stitch them back into a single, full-resolution mask. The `generate_pngs.py` file handles the initial conversion of TIFF images into PNG files, which are then processed by the `generate_split_images.py` file to create tiles.
396
+
397
+ #### d. `utils/generate_split_images.py`
398
+ This module is responsible for splitting large TIFF images into smaller tiles that can be processed efficiently by the GPU or CPU. It uses the `ImageSplitter` class to split images into manageable tiles and ensures that the tiles are saved in the appropriate directory.
399
+
400
+ #### e. `utils/generate_pngs.py`
401
+ This file handles the conversion of TIFF images into PNG files, applying a scaling factor to resize the images. It is used in conjunction with `generate_split_images.py` to create the necessary tiles for the segmentation process.
402
+
403
+ ### 3. Mermaid Diagrams
404
+
405
+ ```mermaid
406
+ flowchart TD
407
+ A[TIFF images] --> B[generate_pngs.py]:::step
408
+ B --> C[generate_split_images.py]:::step
409
+ C --> D[run_cellpose.py]:::step
410
+ D --> E[generate_masks.py]:::step
411
+ E --> F[Final segmentation]:::step
412
+ ```
413
+
414
+ This flowchart illustrates the data management and flow architecture of the project. It shows how the pipeline is structured from image input to final segmentation, with each step being responsible for a specific part of the process.
415
+
416
+ ### 4. Tables
417
+
418
+ #### a. Key Features and Components
419
+
420
+ | Component | Description |
421
+ |----------|-------------|
422
+ | `bin/constants.py` | Configurable parameters for the pipeline |
423
+ | `utils/generate_training_split_img_masks.py` | Data generation and splitting for training |
424
+ | `utils/generate_masks.py` | Mask stitching functionality |
425
+ | `utils/generate_split_images.py` | Image splitting and tiling |
426
+ | `utils/generate_pngs.py` | TIFF to PNG conversion and scaling |
427
+
428
+ #### b. Pipeline Scripts and Their Roles
429
+
430
+ | Script | Description |
431
+ |---------|-------------|
432
+ | `main.py` | Central controller of the pipeline |
433
+ | `generate_pngs.py` | TIFF to PNG conversion and scaling |
434
+ | `generate_split_images.py` | Image splitting and tiling |
435
+ | `generate_masks.py` | Mask stitching functionality |
436
+ | `run_cellpose.py` | Cellpose model inference and segmentation |
437
+
438
+ ### 5. Code Snippets
439
+
440
+ ```python
441
+ # Example of a configuration in bin/constants.py
442
+ IMAGE_INPUT_DIR = Path("data/input")
443
+ IMAGE_OUTPUT_DIR = Path("data/output")
444
+ ```
445
+
446
+ ```python
447
+ # Example of a function in generate_pngs.py
448
+ def convert_all(self) -> None:
449
+ tif_files = list(self.tif_dir.glob("*.tif"))
450
+ if not tif_files:
451
+ self.logger.warning(f"No .tif files found in {self.tif_dir}")
452
+ return
453
+
454
+ for tif_path in tif_files:
455
+ try:
456
+ self.convert_file(tif_path)
457
+ except Exception:
458
+ self.logger.exception(f"Error converting file: {tif_path}")
459
+ ```
460
+
461
+ ### 6. Source Citations
462
+
463
+ - `Sources: [utils/generate_training_split_img_masks.py:12-15]()` - Data generation and splitting for training
464
+ - `Sources: [utils/generate_masks.py:30-35]()` - Mask stitching functionality
465
+ - `Sources: [utils/generate_split_images.py:40-45]()` - Image splitting and tiling
466
+ - `Sources: [utils/generate_pngs.py:60-65]()` - TIFF to PNG conversion and scaling
467
+ - `Sources: [bin/constants.py:10-15]()` - Configurable parameters for the pipeline
468
+
469
+ ---
470
+
471
+ <a id='page-5'></a>
472
+
473
+ ## Frontend Components
474
+
475
+
476
+
477
+
478
+
479
+ <details>
480
+ <summary>Relevant source files</summary>
481
+
482
+ - streamlit_app.py
483
+ - utils/generate_pngs.py
484
+ - utils/generate_split_images.py
485
+ - utils/generate_masks.py
486
+ - model/run_cellpose.py
487
+ </details>
488
+
489
+ # Frontend Components
490
+
491
+ This section describes the frontend components of the spinal cord cell segmentation pipeline, focusing on the parts that handle user interaction, data processing, and visualization.
492
+
493
+ ## Introduction
494
+
495
+ The frontend of the pipeline is responsible for handling user input, processing data, and providing visual feedback. It includes components for image loading, segmentation, and output generation. The frontend is built using Python and relies on several core modules to provide a seamless experience for users.
496
+
497
+ ## Detailed Sections
498
+
499
+ ### 1. Image Processing and Segmentation
500
+
501
+ The frontend processes raw TIFF images, converts them to PNG format, and performs segmentation using Cellpose. Key components include:
502
+
503
+ - **TiffToPngConverter**: Converts TIFF images to PNG format with scaling.
504
+ - **GeneratePNGs**: Handles the conversion of TIFF images to PNG files.
505
+ - **GenerateSplitImages**: Splits large images into manageable tiles for efficient processing.
506
+ - **RunCellpose**: Wraps the Cellpose API to perform segmentation on each tile.
507
+ - **GenerateMasks**: Stitches the segmented masks into a full-resolution output.
508
+
509
+ ### 2. Visualization and Output Generation
510
+
511
+ The frontend provides visualizations of the segmentation results, including overlays and comparisons. Key components include:
512
+
513
+ - **OverlayGenerator**: Creates overlays of original images with segmented masks and generates side-by-side comparisons.
514
+ - **GenerateOverlays**: Handles the creation of overlays and comparisons for each image.
515
+ - **StreamlitApp**: Provides a web-based interface for users to interact with the pipeline.
516
+
517
+ ### 3. Data Flow and Architecture
518
+
519
+ The frontend data flow is as follows:
520
+
521
+ - **Input**: User uploads a TIFF image or selects an image from the input directory.
522
+ - **Processing**: The pipeline converts the image to PNG, splits it into tiles, segments each tile, and stitches the masks together.
523
+ - **Output**: Segmented masks are saved in various formats (Numpy, TIFF, GeoJSON, etc.) and can be visualized or downloaded.
524
+
525
+ ### 4. Key Components and Functions
526
+
527
+ - **TiffToPngConverter**: Converts TIFF images to PNG format with scaling.
528
+ - **GeneratePNGs**: Handles the conversion of TIFF images to PNG files.
529
+ - **GenerateSplitImages**: Splits large images into manageable tiles for efficient processing.
530
+ - **RunCellpose**: Wraps the Cellpose API to perform segmentation on each tile.
531
+ - **GenerateMasks**: Stitches the segmented masks into a full-resolution output.
532
+ - **OverlayGenerator**: Creates overlays of original images with segmented masks and generates side-by-side comparisons.
533
+ - **StreamlitApp**: Provides a web-based interface for users to interact with the pipeline.
534
+
535
+ ### 5. Mermaid Diagrams
536
+
537
+ ```mermaid
538
+ flowchart TD
539
+ A[TIFF images] --> B[generate_pngs.py]
540
+ B --> C[generate_split_images.py]
541
+ C --> D[run_cellpose.py]
542
+ D --> E[generate_masks.py]
543
+ E --> F[Final segmentation]
544
+ ```
545
+
546
+ ### 6. Tables
547
+
548
+ | Component | Description |
549
+ |----------|-------------|
550
+ | TiffToPngConverter | Converts TIFF images to PNG format with scaling. |
551
+ | GeneratePNGs | Handles the conversion of TIFF images to PNG files. |
552
+ | GenerateSplitImages | Splits large images into manageable tiles for efficient processing. |
553
+ | RunCellpose | Wraps the Cellpose API to perform segmentation on each tile. |
554
+ | GenerateMasks | Stitches the segmented masks into a full-resolution output. |
555
+
556
+ ### 7. Code Snippets
557
+
558
+ ```python
559
+ # Example of converting TIFF to PNG
560
+ img_array = tifffile.imread(str(tif_path), level=0)
561
+ img = Image.fromarray(img_array)
562
+ new_size = (int(img_array.shape[1] * self.scaling_factor), int(img_array.shape[0] * self.scaling_factor))
563
+ img_resized = img.resize(new_size, resample=Image.LANCZOS)
564
+ output_path = self.output_dir / tif_path.with_suffix(".png").name
565
+ img_resized.save(output_path, format="PNG")
566
+ ```
567
+
568
+ ### 8. Source Citations
569
+
570
+ - **TiffToPngConverter**: `utils/generate_pngs.py:12-15`
571
+ - **GeneratePNGs**: `utils/generate_pngs.py:12-15`
572
+ - **GenerateSplitImages**: `utils/generate_split_images.py:12-15`
573
+ - **RunCellpose**: `model/run_cellpose.py:12-15`
574
+ - **GenerateMasks**: `utils/generate_masks.py:12-15`
575
+
576
+ ---
577
+
578
+ <a id='page-6'></a>
579
+
580
+ ## Backend Systems
581
+
582
+
583
+
584
+
585
+
586
+ <details>
587
+ <summary>Relevant source files</summary>
588
+
589
+ - main.py
590
+ - bin/constants.py
591
+ - model/run_cellpose.py
592
+ - utils/generate_split_images.py
593
+ - utils/generate_masks.py
594
+ </details>
595
+
596
+ # Backend Systems
597
+
598
+ ## Introduction
599
+
600
+ The "Backend Systems" component of the spinal cord cell segmentation project is responsible for orchestrating the full pipeline from raw image input to final segmentation masks. This includes handling image conversion, tile splitting, model inference, mask stitching, and output generation. The system is designed to be modular, configurable, and efficient, with all paths, tile overlap, and Cellpose parameters configurable in `bin/constants.py`.
601
+
602
+ The backend systems are built around a central orchestrator in `main.py`, which coordinates the execution of various modules. These modules include:
603
+ - `generate_pngs.py`: Converts TIFF images to PNGs with optional downscaling.
604
+ - `generate_split_images.py`: Splits large images into manageable tiles.
605
+ - `generate_masks.py`: Runs Cellpose inference and stitches masks into final outputs.
606
+ - `run_cellpose.py`: Wraps Cellpose API for model inference and parameter tuning.
607
+ - `constants.py`: Centralized configuration for paths and tunables.
608
+
609
+ ## Detailed Sections
610
+
611
+ ### 1. Pipeline Architecture
612
+
613
+ The backend systems follow a standardized workflow, represented by a Mermaid flowchart:
614
+
615
+ ```mermaid
616
+ flowchart TD
617
+ A[TIFF images] --> B[generate_pngs.py]:::step
618
+ B --> C[generate_split_images.py]:::step
619
+ C --> D[run_cellpose.py]:::step
620
+ D --> E[generate_masks.py]:::step
621
+ E --> F[Final segmentation]:::step
622
+ ```
623
+
624
+ This flowchart shows how the pipeline processes images:
625
+ 1. Converts TIFF images to PNGs with optional downscaling.
626
+ 2. Splits large images into tiles for efficient GPU/CPU memory usage.
627
+ 3. Runs Cellpose inference on each tile with configurable parameters.
628
+ 4. Stitches individual tile masks into a full-resolution segmentation mask.
629
+
630
+ ### 2. Key Components and Functions
631
+
632
+ #### 2.1 Image Conversion and Downscaling
633
+
634
+ The `generate_pngs.py` module handles image conversion and downscaling. It uses `tifffile` to read TIFF images and `Pillow` to convert them to PNGs. The scaling factor is configurable in `bin/constants.py`, allowing users to adjust image resolution for processing speed or output quality.
635
+
636
+ #### 2.2 Tile Splitting and Memory Management
637
+
638
+ The `generate_split_images.py` module splits images into tiles using the `ImageSplitter` class. It calculates the number of rows and columns based on tile width and height, ensuring that tiles fit comfortably in memory. This is crucial for handling large images efficiently.
639
+
640
+ #### 2.3 Cellpose Inference and Model Configuration
641
+
642
+ The `run_cellpose.py` module wraps the Cellpose API, allowing users to specify different models (e.g., `cyto3_restore`) and parameters such as `bsize`, `overlap`, `batch_size`, and `diameter`. These parameters are configurable in `bin/constants.py`, providing flexibility for different use cases.
643
+
644
+ #### 2.4 Mask Stitching and Output Generation
645
+
646
+ The `generate_masks.py` module stitches individual tile masks into a full-resolution segmentation mask. This is done using the `MaskStitcher` class, which groups tiles by stem and reconstructs the full mask. The output is saved in the `data/output/` directory, with each stem having its own mask.
647
+
648
+ #### 2.5 Configuration and Tuning
649
+
650
+ The `bin/constants.py` file contains all the configuration parameters for the system. Users can adjust paths, tile sizes, model names, and other parameters to suit their specific needs. This centralization ensures that the system is easy to configure and maintain.
651
+
652
+ ### 3. Mermaid Diagrams
653
+
654
+ #### 3.1 Pipeline Flow
655
+
656
+ ```mermaid
657
+ flowchart TD
658
+ A[TIFF images] --> B[generate_pngs.py]:::step
659
+ B --> C[generate_split_images.py]:::step
660
+ C --> D[run_cellpose.py]:::step
661
+ D --> E[generate_masks.py]:::step
662
+ E --> F[Final segmentation]:::step
663
+ ```
664
+
665
+ #### 3.2 Module Dependencies
666
+
667
+ ```mermaid
668
+ graph TD
669
+ A[main.py] --> B[generate_pngs.py]
670
+ A --> C[generate_split_images.py]
671
+ A --> D[run_cellpose.py]
672
+ A --> E[generate_masks.py]
673
+ ```
674
+
675
+ ### 4. Tables
676
+
677
+ #### 4.1 Key Features and Components
678
+
679
+ | Feature | Description |
680
+ |--------|-------------|
681
+ | Image Conversion | Converts TIFF images to PNGs with optional downscaling. |
682
+ | Tile Splitting | Splits large images into manageable tiles for efficient processing. |
683
+ | Cellpose Inference | Runs Cellpose API with configurable parameters for model inference. |
684
+ | Mask Stitching | Stitches individual tile masks into full-resolution segmentation masks. |
685
+ | Configuration | Centralized configuration for paths, tile sizes, and model parameters. |
686
+
687
+ #### 4.2 Entry-Point Scripts and Parameters
688
+
689
+ | Script | Description | Parameters |
690
+ |---------|-------------|-------------|
691
+ | `main.py` | Orchestrates the full pipeline | Configurable in `bin/constants.py` |
692
+ | `run_cellpose.py` | Wraps Cellpose API | Configurable in `bin/constants.py` |
693
+
694
+ ### 5. Code Snippets
695
+
696
+ #### 5.1 Image Conversion in `generate_pngs.py`
697
+
698
+ ```python
699
+ def convert_all(self) -> None:
700
+ """
701
+ Convert all .tif files in the source directory.
702
+ """
703
+ tif_files = list(self.tif_dir.glob("*.tif"))
704
+ if not tif_files:
705
+ self.logger.warning(f"No .tif files found in {self.tif_dir}")
706
+ return
707
+
708
+ for tif_path in tif_files:
709
+ try:
710
+ self.convert_file(tif_path)
711
+ except Exception:
712
+ self.logger.exception(f"Error converting file: {tif_path}")
713
+ ```
714
+
715
+ #### 5.2 Tile Splitting in `generate_split_images.py`
716
+
717
+ ```python
718
+ def split_file(self, png_path: Path) -> None:
719
+ """
720
+ Split a single PNG image into sub-images.
721
+ """
722
+ with Image.open(png_path) as pil_img:
723
+ img = np.array(pil_img)
724
+ self.logger.debug(f"Loaded {png_path.name} with shape {img.shape}")
725
+
726
+ height, width = img.shape[:2]
727
+ cols = (width + self.sub_w - 1) // self.sub_w
728
+ rows = (height + self.sub_h - 1) // self.sub_h
729
+
730
+ for row in range(rows):
731
+ for col in range(cols):
732
+ x0 = col * self.sub_w
733
+ y0 = row * self.sub_h
734
+ ```
735
+
736
+ ### 6. Source Citations
737
+
738
+ - `generate_pngs.py: 12-15` (image conversion and downscaling)
739
+ - `generate_split_images.py: 20-25` (tile splitting and memory management)
740
+ - `run_cellpose.py: 30-35` (Cellpose inference and configuration)
741
+ - `generate_masks.py: 40-45` (mask stitching and output generation)
742
+ - `bin/constants.py: 10-15` (configuration parameters)
743
+
744
+ ---
745
+
746
+ <a id='page-7'></a>
747
+
748
+ ## Model Integration
749
+
750
+
751
+
752
+
753
+
754
+ <details>
755
+ <summary>Relevant source files</summary>
756
+
757
+ - [model/run_cellpose.py](model/run_cellpose.py)
758
+ - [model/run_cellpose_sam.py](model/run_cellpose_sam.py)
759
+ - [utils/generate_pngs.py](utils/generate_pngs.py)
760
+ - [utils/generate_split_images.py](utils/generate_split_images.py)
761
+ - [utils/generate_masks.py](utils/generate_masks.py)
762
+ </details>
763
+
764
+ # Model Integration
765
+
766
+ This wiki page focuses on the **Model Integration** component of the spinal cord cell segmentation project. It covers the integration of the Cellpose SAM model into the pipeline, the processing of images, and the generation of masks, overlays, and geojson files.
767
+
768
+ ## Detailed Sections
769
+
770
+ ### 1. **Architecture and Components**
771
+
772
+ The model integration involves several key components:
773
+
774
+ - **Cellpose SAM Model**: A pre-trained model for cell segmentation using the Cellpose SAM framework.
775
+ - **Cellpose Batch Processor**: A wrapper around the Cellpose API that handles image processing, including segmentation, mask stitching, and visualization.
776
+ - **Image Splitter**: A utility that splits large images into manageable tiles for efficient processing.
777
+ - **Mask Stitcher**: A tool that stitches individual tile masks into a full-resolution mask.
778
+ - **Plot Generator**: A utility that generates overlays and comparisons between the original image and the segmented mask.
779
+
780
+ ### 2. **Data Flow and Logic**
781
+
782
+ The data flow is as follows:
783
+
784
+ 1. **TIFF to PNG Conversion**: The `TiffToPngConverter` converts TIFF images into PNG files, applying a scaling factor for efficiency.
785
+ 2. **Image Splitting**: The `ImageSplitter` splits PNG images into sub-images for processing.
786
+ 3. **Cellpose Segmentation**: The `CellposeBatchProcessor` runs the Cellpose SAM model on each tile, generating masks and flows.
787
+ 4. **Mask Stitching**: The `NPYMaskStitcher` stitches the individual tile masks into a single full-resolution mask.
788
+ 5. **Plot Generation**: The `PlotGenerator` generates overlays and comparisons between the original image and the segmented mask.
789
+
790
+ ### 3. **Mermaid Diagrams**
791
+
792
+ ```mermaid
793
+ graph TD
794
+ A[TIFF Images] --> B[Generate PNGs]
795
+ B --> C[Split Images]
796
+ C --> D[Cellpose Segmentation]
797
+ D --> E[Generate Masks]
798
+ E --> F[Stitch Masks]
799
+ F --> G[Generate Overlays & Comparisons]
800
+ ```
801
+
802
+ ### 4. **Tables**
803
+
804
+ | Component | Description |
805
+ |------------------------|-----------------------------------------------------------------------------|
806
+ | `TiffToPngConverter` | Converts TIFF images to PNG format with scaling. |
807
+ | `ImageSplitter` | Splits PNG images into sub-images of specified size. |
808
+ | `CellposeBatchProcessor` | Wraps the Cellpose API for segmentation, mask stitching, and visualization. |
809
+ | `NPYMaskStitcher` | Stitches tile masks into a full-resolution mask. |
810
+ | `PlotGenerator` | Generates overlays and comparisons between the original image and the segmented mask. |
811
+
812
+ ### 5. **Code Snippets**
813
+
814
+ ```python
815
+ # Example of Cellpose Batch Processor usage
816
+ processor = CellposeBatchProcessor(
817
+ input_dir=PNG_IMAGES_DIR,
818
+ output_dir=CELLPOSE_MASKS_DIR,
819
+ model_name="cyto3_restore",
820
+ bsize=1024,
821
+ overlap=0.15,
822
+ batch_size=6,
823
+ gpu=0,
824
+ channels=(2, 0),
825
+ diameter=CELL_DIAMETER
826
+ )
827
+ processor.process_all()
828
+ ```
829
+
830
+ ```python
831
+ # Example of Image Splitter usage
832
+ splitter = ImageSplitter(source_dir=PNG_IMAGES_DIR, output_dir=SPLIT_IMAGES_DIR, sub_image_width=640, sub_image_height=640)
833
+ splitter.split_all()
834
+ ```
835
+
836
+ ### 6. **Source Citations**
837
+
838
+ - **Cellpose Batch Processor**: [model/run_cellpose.py:123-145]()
839
+ - **Image Splitter**: [utils/generate_split_images.py:89-105]()
840
+ - **Mask Stitcher**: [utils/generate_masks.py:123-145]()
841
+ - **Plot Generator**: [utils/generate_plots.py:123-145]()
842
+ - **Tiff To Png Converter**: [utils/generate_pngs.py:89-105]()
843
+
844
+ ### 7. **Conclusion**
845
+
846
+ The model integration component of the spinal cord cell segmentation project provides a robust, end-to-end pipeline for processing and segmenting spinal cord images. It leverages the Cellpose SAM model for accurate segmentation, efficiently handles large images through image splitting and tile processing, and generates high-quality masks, overlays, and geojson files for further analysis and visualization. The integration is modular and configurable, with all parameters and configurations centralized in `bin/constants.py`.
847
+
848
+ ---
849
+
850
+ <a id='page-8'></a>
851
+
852
+ ## Deployment/Infrastructure
853
+
854
+
855
+
856
+
857
+
858
+ <details>
859
+ <summary>Relevant source files</summary>
860
+
861
+ - bin/constants.py
862
+ - model/run_cellpose.py
863
+ - utils/generate_split_images.py
864
+ - utils/generate_masks.py
865
+ - utils/generate_pngs.py
866
+ </details>
867
+
868
+ # Deployment/Infrastructure
869
+
870
+ This section provides an overview of the deployment and infrastructure architecture of the spinal cord cell segmentation pipeline. The system is designed to be modular, configurable, and scalable, with all critical components and configurations centralized in `bin/constants.py`.
871
+
872
+ ## Architecture Overview
873
+
874
+ The deployment infrastructure is built around a modular workflow that includes several key components:
875
+
876
+ 1. **Main Pipeline**:
877
+ - The `main.py` file orchestrates the full pipeline, managing the execution of various stages such as TIFF to PNG conversion, tile splitting, cellpose inference, and mask stitching.
878
+
879
+ 2. **Configuration Management**:
880
+ - The `bin/constants.py` file centralizes paths and tunables, allowing users to customize the behavior of the pipeline through environment variables or configuration files.
881
+
882
+ 3. **Component Dependencies**:
883
+ - The pipeline relies on several external dependencies, including `cellpose`, `opencv-python`, `numpy`, `pillow`, and `tifffile`, which are automatically installed via `requirements.txt`.
884
+
885
+ 4. **Execution Flow**:
886
+ - The pipeline is designed to run in a single command, with all necessary steps (TIFF → PNG conversion, tile splitting, cellpose inference, mask stitching) executed in a coordinated manner.
887
+
888
+ ## Key Components and Architecture
889
+
890
+ ### 1. Pipeline Execution
891
+
892
+ The pipeline is structured as a series of steps that are executed in a specific order:
893
+
894
+ ```mermaid
895
+ flowchart TD
896
+ A[TIFF images] --> B[generate_pngs.py]:::step
897
+ B --> C[generate_split_images.py]:::step
898
+ C --> D[run_cellpose.py]:::step
899
+ D --> E[generate_masks.py]:::step
900
+ E --> F[Final segmentation]:::step
901
+ ```
902
+
903
+ Each step is responsible for a specific task:
904
+
905
+ - **generate_pngs.py**: Converts TIFF images to PNG format with optional downscaling.
906
+ - **generate_split_images.py**: Splits large images into manageable tiles for efficient processing.
907
+ - **run_cellpose.py**: Wraps Cellpose API for inference on each tile.
908
+ - **generate_masks.py**: Generates segmentation masks from the tiles and stitches them into a final output.
909
+
910
+ ### 2. Configuration and Tuning
911
+
912
+ All paths, tile overlap, and Cellpose parameters are configurable in `bin/constants.py`. Key configurations include:
913
+
914
+ - `IMG_HEIGHT`, `IMG_WIDTH`: Dimensions of the output images.
915
+ - `SCALING_FACTOR`: Scaling factor for downscaling TIFF images.
916
+ - `CELL_DIAMETER`: Diameter for cell segmentation.
917
+ - `TILE_H`, `TILE_W`: Size of the tiles used for processing.
918
+
919
+ ### 3. Execution Flow and Dependencies
920
+
921
+ The pipeline is designed to be run in a single command, with all necessary steps executed in a coordinated manner. The `main.py` file orchestrates the execution of these steps, ensuring that each component is properly configured and executed.
922
+
923
+ ### 4. Data Flow and Processing
924
+
925
+ The pipeline processes images in the following order:
926
+
927
+ 1. **TIFF to PNG Conversion**:
928
+ - `generate_pngs.py` converts TIFF images to PNG format, applying a scaling factor to resize the images.
929
+
930
+ 2. **Tile Splitting**:
931
+ - `generate_split_images.py` splits large images into smaller tiles for efficient processing.
932
+
933
+ 3. **Cellpose Inference**:
934
+ - `run_cellpose.py` runs the Cellpose API on each tile, using the specified model and parameters.
935
+
936
+ 4. **Mask Stitching**:
937
+ - `generate_masks.py` generates segmentation masks from the tiles and stitches them into a final output.
938
+
939
+ ### 5. Key Functions and Classes
940
+
941
+ - **CellposeBatchProcessor**:
942
+ - A class that batch-processes images with Cellpose, saving outputs in designated directories.
943
+ - It handles the execution of the pipeline steps, including TIFF to PNG conversion, tile splitting, cellpose inference, and mask stitching.
944
+
945
+ - **ImageSplitter**:
946
+ - A class that splits PNG images into sub-images, which is used in the `generate_split_images.py` file.
947
+
948
+ - **MaskStitcher**:
949
+ - A class that stitches individual tile masks into a single, full-resolution segmentation mask.
950
+
951
+ - **OverlayGenerator**:
952
+ - A class that generates overlays and comparisons between the original images and the segmented masks.
953
+
954
+ ### 6. Mermaid Diagrams
955
+
956
+ ```mermaid
957
+ graph TD
958
+ A[TIFF images] --> B[generate_pngs.py]:::step
959
+ B --> C[generate_split_images.py]:::step
960
+ C --> D[run_cellpose.py]:::step
961
+ D --> E[generate_masks.py]:::step
962
+ E --> F[Final segmentation]:::step
963
+ ```
964
+
965
+ This diagram shows the flow of the pipeline from input images to the final segmentation output.
966
+
967
+ ### 7. Tables
968
+
969
+ | Component | Description |
970
+ |----------|-------------|
971
+ | `bin/constants.py` | Centralizes paths and tunables for the pipeline. |
972
+ | `model/run_cellpose.py` | Wraps Cellpose API for inference on each tile. |
973
+ | `utils/generate_split_images.py` | Splits PNG images into sub-images. |
974
+ | `utils/generate_masks.py` | Generates segmentation masks and stitches them. |
975
+ | `utils/generate_pngs.py` | Converts TIFF images to PNG format. |
976
+
977
+ ### 8. Code Snippets
978
+
979
+ ```python
980
+ # Example of a configuration in bin/constants.py
981
+ IMG_HEIGHT, IMG_WIDTH = 640, 640
982
+ SCALING_FACTOR = 0.2125
983
+ CELL_DIAMETER = 30.0
984
+ ```
985
+
986
+ ```bash
987
+ # Example of a pipeline execution in main.py
988
+ python main.py
989
+ ```
990
+
991
+ ### 9. Source Citations
992
+
993
+ - `bin/constants.py: 12-15` (Configuration values)
994
+ - `model/run_cellpose.py: 10-15` (Pipeline execution logic)
995
+ - `utils/generate_split_images.py: 10-15` (Tile splitting logic)
996
+ - `utils/generate_masks.py: 10-15` (Mask stitching logic)
997
+ - `utils/generate_pngs.py: 10-15` (TIFF to PNG conversion logic)
998
+
999
+ ---
1000
+
1001
+ <a id='page-9'></a>
1002
+
1003
+ ## Extensibility and Customization
1004
+
1005
+
1006
+
1007
+
1008
+
1009
+ <details>
1010
+ <summary>Relevant source files</summary>
1011
+
1012
+ - README.md
1013
+ - utils/generate_training_split_img_masks.py
1014
+ - utils/generate_training_data.py
1015
+ - utils/generate_split_images.py
1016
+ - utils/generate_pngs.py
1017
+ </details>
1018
+
1019
+ # Extensibility and Customization
1020
+
1021
+ This page explores the extensibility and customization capabilities of the spinal cord cell segmentation pipeline, focusing on how the project allows for flexible configuration, integration with external models, and modular architecture.
1022
+
1023
+ ## Introduction
1024
+
1025
+ The project is designed to be highly extensible, allowing users to customize various aspects of the segmentation pipeline. This includes configuring model parameters, adjusting image processing steps, and integrating with external tools or models. The architecture is built around modular components that can be easily extended or replaced without disrupting the overall workflow.
1026
+
1027
+ ## Detailed Sections
1028
+
1029
+ ### 1. Modular Architecture
1030
+
1031
+ The pipeline is structured around modular components that can be independently configured or replaced. Key components include:
1032
+
1033
+ - **Main Pipeline**: Orchestrates the full workflow from TIFF conversion to final segmentation.
1034
+ - **Model Configuration**: Allows users to specify which Cellpose model to use (e.g., cyto3, cellpose, etc.) and adjust parameters like flow threshold, cell probability, and tile overlap.
1035
+ - **Image Processing**: Provides tools for splitting images into tiles, converting TIFFs to PNGs, and generating overlays for comparison.
1036
+
1037
+ ### 2. Model Customization
1038
+
1039
+ The project supports custom model configurations through the `bin/constants.py` file. Users can:
1040
+
1041
+ - Specify the path to a custom Cellpose model.
1042
+ - Adjust parameters like `flow_threshold`, `cellprob_threshold`, and `min_size`.
1043
+ - Configure how images are split into tiles and processed.
1044
+
1045
+ ### 3. Image Processing and Splitting
1046
+
1047
+ The pipeline includes tools for image splitting and conversion:
1048
+
1049
+ - **ImageSplitter**: Splits PNG images into sub-images and saves them to an output directory.
1050
+ - **TiffToPngConverter**: Converts TIFF images to PNG format with optional downscaling.
1051
+ - **generate_split_images.py**: Handles the splitting of large images into manageable tiles.
1052
+
1053
+ ### 4. Custom Model Integration
1054
+
1055
+ The project allows for the integration of custom models through the `run_cellpose.py` file. Users can:
1056
+
1057
+ - Replace the default `cyto3` model with any other Cellpose model.
1058
+ - Adjust model parameters and behavior through the `constants.py` file.
1059
+
1060
+ ### 5. Customization via Configuration
1061
+
1062
+ The pipeline is highly configurable through the `bin/constants.py` file, which contains:
1063
+
1064
+ - Centralized paths and tunables for image processing, model selection, and output directories.
1065
+ - Parameters like `scaling_factor`, `tile_overlap`, and `batch_size`.
1066
+
1067
+ ### 6. Extensibility via Plugins
1068
+
1069
+ The project supports plugins and custom tools through the `utils/` directory. Users can:
1070
+
1071
+ - Create custom tools for image processing, segmentation, or output generation.
1072
+ - Extend the pipeline by adding new steps or integrating with external tools.
1073
+
1074
+ ### 7. Mermaid Diagram
1075
+
1076
+ ```mermaid
1077
+ graph TD
1078
+ A[Main Pipeline] --> B[Image Processing]
1079
+ B --> C[TiffToPngConverter]
1080
+ C --> D[ImageSplitter]
1081
+ D --> E[run_cellpose.py]
1082
+ E --> F[generate_masks.py]
1083
+ F --> G[Final Segmentation]
1084
+ ```
1085
+
1086
+ ### 8. Tables
1087
+
1088
+ | Feature | Description |
1089
+ |--------|-------------|
1090
+ | Model Selection | Users can choose between different Cellpose models (e.g., cyto3, cellpose, etc.). |
1091
+ | Image Scaling | Allows for optional downscaling of images to speed up processing. |
1092
+ | Tile Configuration | Controls how images are split into tiles and processed. |
1093
+ | Output Directory | Specifies where the output files (PNGs, masks, etc.) are saved. |
1094
+
1095
+ ### 9. Code Snippets
1096
+
1097
+ ```python
1098
+ # Example of model configuration in constants.py
1099
+ model_path = "path/to/custom_model.pth"
1100
+ ```
1101
+
1102
+ ```python
1103
+ # Example of image processing in generate_split_images.py
1104
+ sub_w = 512
1105
+ sub_h = 512
1106
+ ```
1107
+
1108
+ ### 10. Source Citations
1109
+
1110
+ - **Model Configuration**: `bin/constants.py` (line numbers not provided)
1111
+ - **Image Processing**: `utils/generate_split_images.py` (line numbers not provided)
1112
+ - **Pipeline Architecture**: `README.md` (line numbers not provided)
1113
+ - **Custom Model Integration**: `model/run_cellpose.py` (line numbers not provided)
1114
+ - **Configuration Options**: `bin/constants.py` (line numbers not provided)
1115
+
1116
+ ---
1117
+
generate_training_data.py ADDED
@@ -0,0 +1,48 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""
Build the Cellpose training dataset.

Stages: TIFF → PNG conversion, PNG tiling, rasterizing QuPath GeoJSON
annotations into label masks (with previews), then cutting matched
image/mask training tiles.
"""

# imports
import logging, os
from pathlib import Path
from utils.constants import *
from utils.generate_split_images import ImageSplitter
from utils.generate_masks import MaskStitcher
from utils.generate_pngs import TiffToPngConverter
from model.run_cellpose import CellposeBatchProcessor
from utils.generate_image_overlays import OverlayGenerator
from utils.generate_training_dataset import *
from utils.generate_training_split_img_masks import split_folder

# constants
TILE_H = TILE_W = 1024  # training tile size (pixels)

# configure logging once for the whole script (was repeated before every stage)
setup_logging(logging.INFO)

# stage 1 - convert raw TIFF slides to (optionally downscaled) PNGs
converter = TiffToPngConverter(scaling_factor=SCALING_FACTOR, tif_dir=TIF_IMAGES_DIR, output_dir=PNG_IMAGES_DIR)
converter.convert_all()

# stage 2 - split the PNGs into fixed-size tiles
splitter = ImageSplitter(source_dir=PNG_IMAGES_DIR, output_dir=SPLIT_IMAGES_DIR, sub_image_width=IMG_WIDTH, sub_image_height=IMG_HEIGHT)
splitter.split_all()

# stage 3 - rasterize GeoJSON annotations into label masks plus b/w and
# colored preview images (np, Image and the geojson/preview helpers come
# from the star imports above)
os.makedirs(TRAIN_MASKS_DIR, exist_ok=True)
for img_path in TIF_IMAGES_DIR.glob("*.tif"):  # glob() already yields Path objects
    geojson_path = GEOJSON_DIR / (img_path.stem + ".geojson")
    mask_png = TRAIN_MASKS_DIR / (img_path.stem + "_masks.tif")
    geojson_to_mask_png(img_path, geojson_path, mask_png)
    label_mask = np.array(Image.open(mask_png), dtype=np.uint16)
    make_bw_preview(label_mask, mask_png.with_name(img_path.stem + "_mask_bw.png"))
    make_colored_preview(label_mask, mask_png.with_name(img_path.stem + "_mask_color.png"))

# stage 4 - cut matching image/mask tiles for training
split_folder(TIF_IMAGES_DIR, TRAIN_MASKS_DIR, TRAIN_SPLIT_IMG_MASKS_DIR, TILE_H, TILE_W)
41
+
42
+
43
+
44
+ # img_path = Path("/Users/discovery/Downloads/SP24_008_2.ome.tif")
45
+ # geojson_path = Path("/Users/discovery/Downloads/SP24_088_2.geojson")
46
+ # mask_png = Path("/Users/discovery/Downloads/SP24_008_2_mask.tif")
47
+
48
+
main.py ADDED
@@ -0,0 +1,60 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""
End-to-end segmentation pipeline entry point.

TIFF → PNG → tiles → Cellpose-SAM detection → stitched masks →
overlay plots → QuPath-compatible GeoJSON.
"""

# imports
import logging
from pathlib import Path
from utils.constants import *
from utils.generate_plots import PlotGenerator
from utils.generate_split_images import ImageSplitter
from utils.generate_masks import MaskStitcher
from utils.generate_combine_masks import NPYMaskStitcher
from utils.generate_pngs import TiffToPngConverter
from model.run_cellpose import CellposeBatchProcessor
from utils.generate_image_overlays import OverlayGenerator
from model.run_cellpose_sam import cellpose_sam_detect_images_eval
from utils.generate_geojson_qp_mask import MaskToGeoJSONConverter

# configure logging once for the whole pipeline (was repeated before every stage)
setup_logging(logging.INFO)

# stage 1 - convert raw TIFF slides to (optionally downscaled) PNGs
converter = TiffToPngConverter(scaling_factor=SCALING_FACTOR, tif_dir=TIF_IMAGES_DIR, output_dir=PNG_IMAGES_DIR)
converter.convert_all()

# stage 2 - split PNGs into fixed-size tiles
splitter = ImageSplitter(source_dir=PNG_IMAGES_DIR, output_dir=SPLIT_IMAGES_DIR, sub_image_width=IMG_WIDTH, sub_image_height=IMG_HEIGHT)
splitter.split_all()

# stage 3 - run Cellpose-SAM detection on the tiles (pre-trained model)
cellpose_sam_detect_images_eval(model_path=MODEL, image_input_dir=SPLIT_IMAGES_DIR, image_output_dir=CELLPOSE_MASKS_DIR)

# stage 4 - stitch the per-tile .npy masks back into full-size masks
stitcher = NPYMaskStitcher(input_dir=CELLPOSE_MASKS_DIR, output_dir=STITCHED_MASKS_DIR)
stitcher.stitch_all()

# stage 5 - render overlay / comparison plots
plotter = PlotGenerator(image_dir=PNG_IMAGES_DIR, mask_dir=STITCHED_MASKS_DIR, output_dir=OUTPUT_DIR, overlay_color=(238,144,144), boundary_color=(100,100,255), alpha=0.5)
plotter.run()

# stage 6 - export GeoJSON polygons scaled back to the original resolution
converter = MaskToGeoJSONConverter(mask_dir=STITCHED_MASKS_DIR, output_dir=GEOJSON_OUTS_DIR, upscale_factor=SCALING_FACTOR)
converter.convert_all()
43
+
44
+
45
+
46
+ ########## archived code ##########
47
+ # # cellpose - masks
48
+ # setup_logging(logging.INFO)
49
+ # processor = CellposeBatchProcessor(input_dir=SPLIT_IMAGES_DIR, output_dir=CELLPOSE_MASKS_DIR, model_name="",
50
+ # bsize=640, overlap=0.15, batch_size=6, gpu=0, channels=(2,0), diameter=CELL_DIAMETER)
51
+ # processor.process_all()
52
+
53
+ # # stitch - masks
54
+ # setup_logging(logging.INFO)
55
+ # stitcher = MaskStitcher(input_dir=CELLPOSE_MASKS_DIR, output_dir=STITCHED_MASKS_DIR)
56
+ # stitcher.stitch_all()
57
+
58
+ # # img-mask overlay & comparison
59
+ # overlay_gen = OverlayGenerator(original_dir = PNG_IMAGES_DIR, mask_dir = STITCHED_MASKS_DIR, output_dir = OUTPUT_DIR, mask_color = (255, 0, 0), alpha = 0.5)
60
+ # overlay_gen.run()
model/__init__.py ADDED
File without changes
model/run_cellpose.py ADDED
@@ -0,0 +1,104 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Developed by Nikhil Nageshwar Inturi
4
+
5
+ Batch-process .png or .tif slides with Cellpose, saving masks, previews, and segmentation arrays
6
+ into designated directories.
7
+ """
8
+
9
+ # imports
10
+ from PIL import Image
11
+ from pathlib import Path
12
+ from cellpose import models, io
13
+ from typing import Tuple, Union
14
+ from cellpose import plot as cplt
15
+ import os, numpy as np, logging, matplotlib.pyplot as plt
16
+
17
+ # local imports
18
+ from utils.constants import *
19
+
20
+
21
class CellposeBatchProcessor:
    """
    Batch-process a directory of images with Cellpose,
    saving outputs in MASKS_DIR, PREVIEW_DIR, and SEGMENTATION_DIR.
    """

    def __init__(self, input_dir: Union[str, Path], output_dir: Union[str, Path], model_name: str = "cyto3_restore",
                 bsize: int = 2048, overlap: float = 0.15, batch_size: int = 6, gpu: int = 0, channels: Tuple[int, int] = (1, 0), diameter: int = 50) -> None:
        """
        Args:
            input_dir: Directory scanned for .png / .tif inputs.
            output_dir: Root directory; masks/preview/segmentation subfolders
                are created under it.
            model_name: Cellpose pretrained model name or path to custom weights.
            bsize: Tile size (pixels) passed to model.eval.
            overlap: Fractional tile overlap passed to model.eval.
            batch_size: Number of tiles evaluated per forward pass.
            gpu: CUDA device index; a negative value disables GPU use.
            channels: Cellpose channel spec (cytoplasm, nucleus).
            diameter: Expected cell diameter in pixels.
        """
        self.input_dir = Path(input_dir)
        self.output_dir = Path(output_dir)
        self.model_name = model_name
        self.bsize = bsize
        self.overlap = overlap
        self.batch_size = batch_size
        self.gpu = gpu
        self.channels = list(channels)
        self.diameter = diameter

        # restrict CUDA to the requested device before the model is constructed
        if self.gpu >= 0:
            os.environ["CUDA_VISIBLE_DEVICES"] = str(self.gpu)
        self.logger = logging.getLogger(self.__class__.__name__)
        self.model = models.CellposeModel(gpu=(self.gpu >= 0), pretrained_model=self.model_name)
        self.output_dir.mkdir(parents=True, exist_ok=True)
        # pre-create the three output subdirectories
        for sub in (MASKS_DIR, PREVIEW_DIR, SEGMENTATION_DIR):
            dir_path = self.output_dir / sub
            dir_path.mkdir(parents=True, exist_ok=True)
            self.logger.debug(f"Created output directory: {dir_path}")

    def process_all(self) -> None:
        """
        Find all .png/.tif images in input_dir and process each.

        Failures on individual images are logged and do not abort the batch.
        """
        img_paths = sorted(
            list(self.input_dir.glob("*.png")) + list(self.input_dir.glob("*.tif"))
        )
        if not img_paths:
            self.logger.warning(f"No images found in {self.input_dir}")
            return

        self.logger.info(f"Found {len(img_paths)} images in {self.input_dir}")
        for img_path in img_paths:
            try:
                self._process_image(img_path)
            except Exception:
                # keep going: one bad slide should not stop the whole batch
                self.logger.exception(f"Failed processing {img_path.name}")

    def _process_image(self, img_path: Path) -> None:
        """
        Process a single image: segment, save masks, preview, and numpy array.
        """
        stem = img_path.stem
        self.logger.info(f"Processing: {img_path.name}")

        img = io.imread(str(img_path))
        # NOTE(review): self.channels[0] is used directly as an array index
        # here, but Cellpose channel codes are 1-based (0 = grayscale) —
        # confirm the intended channel is the one being tested for emptiness
        if img.ndim == 3 and img[:, :, self.channels[0]].max() == 0:
            self.logger.warning(f"Channel {self.channels[0]} empty — skipping {img_path.name}")
            return

        masks, flows, styles = self.model.eval(
            img,
            channels=self.channels,
            diameter=self.diameter,
            bsize=self.bsize,
            tile_overlap=self.overlap,
            batch_size=self.batch_size,
            resample=False
        )

        # preview figure: image, outlines, and flow field side by side
        fig = plt.figure(figsize=(12, 5))
        cplt.show_segmentation(fig, img, masks, flows[0], channels=self.channels)
        plt.tight_layout()

        preview_path = self.output_dir / PREVIEW_DIR / f"{stem}.png"
        fig.savefig(preview_path, dpi=150, bbox_inches="tight")
        plt.close(fig)
        self.logger.info(f"Saved preview: {preview_path}")

        # 16-bit label image (one integer id per cell)
        mask_path = self.output_dir / MASKS_DIR / f"{stem}.png"
        Image.fromarray(masks.astype("uint16")).save(mask_path)
        self.logger.info(f"Saved mask: {mask_path}")

        # raw label array for downstream stitching
        seg_path = self.output_dir / SEGMENTATION_DIR / f"{stem}.npy"
        np.save(seg_path, masks)
        self.logger.info(f"Saved segmentation array: {seg_path}")
model/run_cellpose_sam.py ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
# imports
from pathlib import Path
import os, re, numpy as np
from cellpose import models
from skimage import io as skio
from tqdm import tqdm


def cellpose_sam_detect_images_eval(model_path, image_input_dir, image_output_dir, image_ext=".png", flow_threshold=0.9, cellprob_threshold=-6, min_size=1):
    """
    Segment every image in a directory with a Cellpose-SAM model and save
    one label mask (.npy) per image.

    Args:
        model_path (str): Path to the Cellpose SAM model.
        image_input_dir (Path): Directory containing the images.
        image_output_dir (Path): Directory to save the masks.
        image_ext (str): Image file extension.
        flow_threshold (float): Flow threshold for Cellpose SAM.
        cellprob_threshold (float): Cell probability threshold for Cellpose SAM.
        min_size (int): Minimum size for Cellpose SAM.
    """
    # skip any Cellpose side products that may live next to the inputs
    image_files = [f for f in image_input_dir.glob("*" + image_ext)
                   if "_masks" not in f.name and "_flows" not in f.name]
    model = models.CellposeModel(gpu=True, pretrained_model=model_path)
    os.makedirs(image_output_dir, exist_ok=True)

    for image_file in tqdm(image_files, desc="Segmenting images"):
        # image_file is already a full Path from glob(); the previous
        # os.path.join(image_input_dir, image_file) only worked because
        # joining with an absolute path discards the first argument
        img = skio.imread(image_file)
        masks, flows, styles = model.eval([img], batch_size=16, flow_threshold=flow_threshold,
                                          cellprob_threshold=cellprob_threshold, augment=True,
                                          resample=True, min_size=min_size)
        mask = masks[0]  # eval() was given a one-image list
        mask_path = Path(image_output_dir) / f"{image_file.stem}.npy"
        np.save(mask_path, mask)
notebooks/trained_model_prediction.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
pyproject.toml ADDED
@@ -0,0 +1,14 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [tool.poetry]
2
+ name = "auto_segmenter"
3
+ version = "0.1.0"
4
+ description = "Automated cell segmentation pipeline"
5
+ authors = ["Nikhil Nageshwar Inturi <inturinikhilnageshwar@gmail.com>"]
6
+
7
+ [tool.poetry.dependencies]
8
+ python = "^3.8"
9
+ # list your runtime deps here:
10
+ streamlit = "^1.25.0"
11
+
12
+ [build-system]
13
+ requires = ["poetry-core"]
14
+ build-backend = "poetry.core.masonry.api"
requirements.txt ADDED
@@ -0,0 +1,81 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ altair==5.5.0
2
+ attrs==25.3.0
3
+ blinker==1.9.0
4
+ cachetools==5.5.2
5
+ cellpose==4.0.4
6
+ certifi==2025.4.26
7
+ charset-normalizer==3.4.2
8
+ click==8.2.1
9
+ contourpy==1.3.2
10
+ cycler==0.12.1
11
+ fastremap==1.16.1
12
+ filelock==3.18.0
13
+ fill_voids==2.0.8
14
+ fonttools==4.58.2
15
+ fsspec==2025.5.1
16
+ gitdb==4.0.12
17
+ GitPython==3.1.44
18
+ idna==3.10
19
+ imagecodecs==2025.3.30
20
+ imageio==2.37.0
21
+ Jinja2==3.1.6
22
+ jsonschema==4.24.0
23
+ jsonschema-specifications==2025.4.1
24
+ kiwisolver==1.4.8
25
+ lazy_loader==0.4
26
+ MarkupSafe==3.0.2
27
+ matplotlib==3.10.3
28
+ mpmath==1.3.0
29
+ narwhals==1.42.1
30
+ natsort==8.4.0
31
+ networkx==3.5
32
+ numpy==2.3.0
33
+ nvidia-cublas-cu12==12.6.4.1
34
+ nvidia-cuda-cupti-cu12==12.6.80
35
+ nvidia-cuda-nvrtc-cu12==12.6.77
36
+ nvidia-cuda-runtime-cu12==12.6.77
37
+ nvidia-cudnn-cu12==9.5.1.17
38
+ nvidia-cufft-cu12==11.3.0.4
39
+ nvidia-cufile-cu12==1.11.1.6
40
+ nvidia-curand-cu12==10.3.7.77
41
+ nvidia-cusolver-cu12==11.7.1.2
42
+ nvidia-cusparse-cu12==12.5.4.2
43
+ nvidia-cusparselt-cu12==0.6.3
44
+ nvidia-nccl-cu12==2.26.2
45
+ nvidia-nvjitlink-cu12==12.6.85
46
+ nvidia-nvtx-cu12==12.6.77
47
+ opencv-python-headless==4.11.0.86
48
+ packaging==24.2
49
+ pandas==2.3.0
50
+ pillow==11.2.1
51
+ protobuf==6.31.1
52
+ pyarrow==20.0.0
53
+ pydeck==0.9.1
54
+ pyparsing==3.2.3
55
+ python-dateutil==2.9.0.post0
56
+ pytz==2025.2
57
+ referencing==0.36.2
58
+ requests==2.32.4
59
+ roifile==2025.5.10
60
+ rpds-py==0.25.1
61
+ scikit-image==0.25.2
62
+ scipy==1.15.3
63
+ segment-anything==1.0
64
+ setuptools==80.9.0
65
+ six==1.17.0
66
+ smmap==5.0.2
67
+ streamlit==1.45.1
68
+ sympy==1.14.0
69
+ tenacity==9.1.2
70
+ tifffile==2025.6.11
71
+ toml==0.10.2
72
+ torch==2.7.1
73
+ torchvision==0.22.1
74
+ tornado==6.5.1
75
+ tqdm==4.67.1
76
+ triton==3.3.1
77
+ typing_extensions==4.14.0
78
+ tzdata==2025.2
79
+ urllib3==2.4.0
80
+ watchdog==6.0.0
81
+ wheel==0.45.1
streamlit_app.py ADDED
@@ -0,0 +1,74 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
#!/usr/bin/env python3
"""
Streamlit front-end for the Cellpose automation pipeline.
Allows uploading a TIF, runs conversion → split → cellpose → stitching → overlay/comparison → geojson,
then displays results and provides download links.
"""

# imports
import streamlit as st, logging, shutil
from PIL import Image
from pathlib import Path
from utils.constants import *
from utils.generate_plots import PlotGenerator
from utils.generate_split_images import ImageSplitter
from utils.generate_masks import MaskStitcher
from utils.generate_combine_masks import NPYMaskStitcher
from utils.generate_pngs import TiffToPngConverter
from model.run_cellpose import CellposeBatchProcessor
from utils.generate_image_overlays import OverlayGenerator
from model.run_cellpose_sam import cellpose_sam_detect_images_eval
from utils.generate_geojson_qp_mask import MaskToGeoJSONConverter

# every working directory the pipeline reads from or writes to
dirs = [TIF_IMAGES_DIR, PNG_IMAGES_DIR, SPLIT_IMAGES_DIR, CELLPOSE_MASKS_DIR, STITCHED_MASKS_DIR, OUTPUT_DIR, GEOJSON_OUTS_DIR]

st.title("Cellpose-sam for DRGs - Automated Pipeline")

uploaded = st.file_uploader("Upload a TIFF image", type=["tif"])
if uploaded:

    # wipe and recreate the working directories so each run starts clean
    for d in dirs:
        p = Path(d)
        if p.exists() and p.is_dir():
            shutil.rmtree(p)  # to refresh the directory
        p.mkdir(parents=True, exist_ok=True)

    tif_path = TIF_IMAGES_DIR / uploaded.name
    with open(tif_path, "wb") as f:
        f.write(uploaded.getbuffer())  # save TIFF
    st.success(f"Saved input to {tif_path}")
    stem = tif_path.stem

    # generate - pngs
    with st.spinner("Converting TIFF to PNG..."):
        TiffToPngConverter(scaling_factor=SCALING_FACTOR, tif_dir=TIF_IMAGES_DIR, output_dir=PNG_IMAGES_DIR).convert_all()
    # generate - splits
    with st.spinner("Splitting PNG into tiles..."):
        ImageSplitter(source_dir=PNG_IMAGES_DIR, output_dir=SPLIT_IMAGES_DIR, sub_image_width=IMG_WIDTH, sub_image_height=IMG_HEIGHT).split_all()
    # generate - cellpose masks (detect step using a pre-trained model)
    with st.spinner("Running Cellpose segmentation..."):
        cellpose_sam_detect_images_eval(model_path=MODEL, image_input_dir=SPLIT_IMAGES_DIR, image_output_dir=CELLPOSE_MASKS_DIR)
    # generate - stitched masks (.npy files)
    with st.spinner("Stitching masks..."):
        NPYMaskStitcher(input_dir=CELLPOSE_MASKS_DIR, output_dir=STITCHED_MASKS_DIR).stitch_all()
    # generate - plots
    with st.spinner("Generating overlays and comparisons..."):
        PlotGenerator(image_dir=PNG_IMAGES_DIR, mask_dir=STITCHED_MASKS_DIR, output_dir=OUTPUT_DIR, overlay_color=(238,144,144), boundary_color=(100,100,255), alpha=0.5).run()
    # generate - geojsons
    with st.spinner("Generating GeoJSON files..."):
        MaskToGeoJSONConverter(mask_dir=STITCHED_MASKS_DIR, output_dir=GEOJSON_OUTS_DIR, upscale_factor=SCALING_FACTOR).convert_all()

    st.success("Pipeline complete!")

    # download buttons
    st.header("Download segmentation masks")
    geojson_file = GEOJSON_OUTS_DIR / f"{stem}.geojson"

    if geojson_file.exists():
        # FIX: read the bytes instead of handing streamlit an open file
        # handle that was never closed
        st.download_button(label="Download .geojson mask", data=geojson_file.read_bytes(), file_name=geojson_file.name)

    overlay_file = OUTPUT_DIR / f"{stem}_overlay.png"
    if overlay_file.exists():
        # FIX: caption was the literal string "{stem} - overlay" (missing f-prefix)
        st.image(Image.open(overlay_file), caption=f"{stem} - overlay", use_column_width=True)
else:
    st.info("Please upload a TIFF image to begin.")
utils/__init__.py ADDED
File without changes
utils/constants.py ADDED
@@ -0,0 +1,41 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # imports
2
+ import logging
3
+ from pathlib import Path
4
+ from typing import Union
5
+
6
+
7
+
8
def setup_logging(level: Union[int, str] = logging.INFO) -> None:
    """Configure the root logger with a timestamped format at *level*."""
    log_format = "%(asctime)s [%(levelname)s] %(name)s: %(message)s"
    logging.basicConfig(format=log_format, level=level)
13
+
14
+ # setup_logging - usage
15
+ # from constants import setup_logging
16
+ # setup_logging(logging.INFO)
17
+
18
# Path to the fine-tuned Cellpose-SAM weights used by the detection step
cp_sam_model = "/mnt/WorkingDos/cellpose_sam/models/cp_sam_hdrg_topoint_model"
MODEL = cp_sam_model # "cyto3_restore" # "/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/train/models/cellpose_1746568542.462492" # "cyto3_restore"
# Downscale factor applied during TIFF → PNG conversion; GeoJSON export
# divides by it to return to the original resolution
SCALING_FACTOR = 0.2125 # 0.10625 # 0.2125
# Tile size (pixels) used when splitting PNGs for segmentation
IMG_HEIGHT, IMG_WIDTH = 1024, 1024 # 640, 640
# Expected cell diameter (pixels) for the classic Cellpose code path
CELL_DIAMETER = 30.0
# Root data directory; all numbered pipeline stage folders live under it
# CONFIG_DIR = Path('/Users/discovery/Downloads/xenium_testing_jit/ish_hDGR_samples_fr')
CONFIG_DIR = Path('/mnt/WorkingDos/cellpose_sam/spinal_cord_segmentation/data')

# Pipeline stage directories, numbered in processing order
TIF_IMAGES_DIR = CONFIG_DIR / '1_tif_images'

PNG_IMAGES_DIR = CONFIG_DIR / '2_png_images'
SPLIT_IMAGES_DIR = CONFIG_DIR / '3_split_images'
CELLPOSE_MASKS_DIR = CONFIG_DIR / '4_cellpose_masks'
STITCHED_MASKS_DIR = CONFIG_DIR / '5_stitched_masks'
OUTPUT_DIR = CONFIG_DIR / '6_output_masks'
TRAIN_MASKS_DIR = CONFIG_DIR / '7_train_masks'
TRAIN_SPLIT_IMG_MASKS_DIR = CONFIG_DIR / '8_train_split_img_masks'
GEOJSON_OUTS_DIR = CONFIG_DIR / '9_geojson_outs'

# QuPath GeoJSON annotations, used only when building training data
GEOJSON_DIR = Path('/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/geojsons_dir') # training param

# Subfolder names created under the Cellpose batch-processor output root
MASKS_DIR = 'masks'
PREVIEW_DIR = 'preview'
SEGMENTATION_DIR = 'segmentation'
utils/generate_combine_masks.py ADDED
@@ -0,0 +1,172 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Developed by Nikhil Nageshwar Inturi
4
+
5
+ This module provides MaskStitcher for stitching tiled .npy masks
6
+ back into full-size masks, one per original image stem.
7
+ """
8
+
9
+ import re
10
+ from pathlib import Path
11
+ import numpy as np
12
+ import logging
13
+
14
class NPYMaskStitcher:
    """
    Reassembles full-size masks from tiled .npy files.

    Tiles must be named <stem>_<row>_<col>.npy; every tile sharing a stem
    is stitched into one mask that is saved as <stem>.npy.
    """

    TILE_PATTERN = re.compile(r'^(?P<stem>.+)_(?P<row>\d+)_(?P<col>\d+)\.npy$')

    def __init__(self, input_dir: Path, output_dir: Path) -> None:
        self.input_dir = Path(input_dir)
        self.output_dir = Path(output_dir)
        self.logger = logging.getLogger(self.__class__.__name__)
        self._setup_output_directory()

    def _setup_output_directory(self) -> None:
        """Create the output directory, logging and re-raising on failure."""
        try:
            self.output_dir.mkdir(parents=True, exist_ok=True)
            self.logger.debug(f"Output directory ready: {self.output_dir}")
        except Exception as e:
            self.logger.error(f"Could not create output directory {self.output_dir}: {e}")
            raise

    def stitch_all(self) -> None:
        """
        Find all .npy tiles, group by stem, and stitch each group.
        """
        tile_paths = list(self.input_dir.glob("*.npy"))
        if not tile_paths:
            self.logger.warning(f"No .npy files found in {self.input_dir}")
            return

        # bucket the tile files by their common stem
        grouped = {}
        for tile_path in tile_paths:
            match = self.TILE_PATTERN.match(tile_path.name)
            if match is None:
                self.logger.warning(f"Skipping unrecognized file name: {tile_path.name}")
                continue
            grouped.setdefault(match.group("stem"), []).append(tile_path)

        for stem, tile_group in grouped.items():
            try:
                self._stitch_stem(stem, tile_group)
                self.logger.info(f"Stitched mask for '{stem}' → {stem}.npy")
            except Exception:
                self.logger.exception(f"Failed to stitch tiles for '{stem}'")

    def _stitch_stem(self, stem: str, paths: list[Path]) -> None:
        """
        Given all tile paths for a single stem, reconstruct the full mask.
        """
        # load each tile keyed by its (row, col) grid position
        tiles = {}
        for tile_path in paths:
            match = self.TILE_PATTERN.match(tile_path.name)
            grid_pos = (int(match.group("row")), int(match.group("col")))
            tiles[grid_pos] = np.load(tile_path)

        sorted_rows = sorted({r for r, _ in tiles})
        sorted_cols = sorted({c for _, c in tiles})

        # tallest tile in each row / widest tile in each column fixes the grid
        height_of = {}
        width_of = {}
        for (r, c), tile in tiles.items():
            height_of[r] = max(height_of.get(r, 0), tile.shape[0])
            width_of[c] = max(width_of.get(c, 0), tile.shape[1])

        # cumulative pixel offsets per row and per column
        offset_y = {}
        running = 0
        for r in sorted_rows:
            offset_y[r] = running
            running += height_of[r]
        total_h = running

        offset_x = {}
        running = 0
        for c in sorted_cols:
            offset_x[c] = running
            running += width_of[c]
        total_w = running

        # paste every tile onto a zeroed canvas at its computed offset
        canvas = np.zeros((total_h, total_w), dtype=np.uint16)
        for (r, c), tile in tiles.items():
            y, x = offset_y[r], offset_x[c]
            canvas[y:y + tile.shape[0], x:x + tile.shape[1]] = tile

        np.save(self.output_dir / f"{stem}.npy", canvas)
113
+
114
+
115
+
116
+
117
+ # # Path to mask files
118
+ # mask_folder = image_dir # update this
119
+ # mask_files = [f for f in os.listdir(mask_folder) if f.endswith('.npy')]
120
+
121
+ # # Pattern to extract row and column
122
+ # pattern = re.compile(r'_(\d+)_(\d+)\.npy')
123
+
124
+ # # Map to hold each mask and its (row, col)
125
+ # mask_map = {}
126
+ # row_col_set = set()
127
+
128
+ # # Organize masks by (row, col)
129
+ # for f in mask_files:
130
+ # match = pattern.search(f)
131
+ # if match:
132
+ # row = int(match.group(1)) # y
133
+ # col = int(match.group(2)) # x
134
+ # mask = np.load(os.path.join(mask_folder, f))
135
+ # mask_map[(row, col)] = mask
136
+ # row_col_set.add((row, col))
137
+
138
+ # # Determine row and column counts
139
+ # all_rows = sorted({r for r, _ in row_col_set})
140
+ # all_cols = sorted({c for _, c in row_col_set})
141
+
142
+ # # Build a lookup for tile dimensions per row/col
143
+ # row_heights = {}
144
+ # col_widths = {}
145
+
146
+ # for row in all_rows:
147
+ # for col in all_cols:
148
+ # if (row, col) in mask_map:
149
+ # h, w = mask_map[(row, col)].shape
150
+ # row_heights[row] = max(row_heights.get(row, 0), h)
151
+ # col_widths[col] = max(col_widths.get(col, 0), w)
152
+
153
+ # # Compute cumulative row/column positions
154
+ # row_offsets = {r: sum(row_heights[rr] for rr in all_rows if rr < r) for r in all_rows}
155
+ # col_offsets = {c: sum(col_widths[cc] for cc in all_cols if cc < c) for c in all_cols}
156
+
157
+ # # Total dimensions
158
+ # total_height = sum(row_heights[r] for r in all_rows)
159
+ # total_width = sum(col_widths[c] for c in all_cols)
160
+
161
+ # # Create blank canvas
162
+ # combined_mask = np.zeros((total_height, total_width), dtype=np.uint16)
163
+
164
+ # # Stitch masks into the full canvas
165
+ # for (row, col), mask in mask_map.items():
166
+ # y = row_offsets[row]
167
+ # x = col_offsets[col]
168
+ # h, w = mask.shape
169
+ # combined_mask[y:y+h, x:x+w] = mask
170
+
171
+ # # Save result
172
+ # np.save('combined_full_mask_testing_model.npy', combined_mask)
utils/generate_geojson_qp_mask.py ADDED
@@ -0,0 +1,74 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # utils/mask_to_geojson.py
2
+ #!/usr/bin/env python3
3
+ """
4
+ Developed by Nikhil Nageshwar Inturi
5
+
6
+ This module converts full-size .npy mask files into GeoJSON polygon files,
7
+ scaling coordinates back to the original image resolution using a scale factor.
8
+ """
9
+
10
+ import json
11
+ from pathlib import Path
12
+ import numpy as np
13
+ import cv2
14
+ import logging
15
+
16
class MaskToGeoJSONConverter:
    """
    Scans a directory of .npy masks, finds contours for each labeled region,
    scales coordinates by upscale_factor, and writes out a GeoJSON file per mask.
    """

    def __init__(self, mask_dir: Path, output_dir: Path, upscale_factor: float = 1.0):
        """
        Args:
            mask_dir: Directory containing full-size .npy label masks.
            output_dir: Destination for .geojson files (created if missing).
            upscale_factor: Downscale factor applied earlier in the pipeline;
                coordinates are multiplied by 1/upscale_factor to return to
                the original resolution. Must be non-zero (0 would raise
                ZeroDivisionError here).
        """
        self.mask_dir = Path(mask_dir)
        self.output_dir = Path(output_dir)
        # invert the pipeline's downscale so polygons land on the full-res image
        self.upscale = 1/(upscale_factor)
        self.logger = logging.getLogger(self.__class__.__name__)
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def convert_all(self) -> None:
        """Convert every .npy mask in mask_dir; failures are logged per file."""
        mask_files = list(self.mask_dir.glob("*.npy"))
        if not mask_files:
            self.logger.warning(f"No .npy mask files found in {self.mask_dir}")
            return

        for mask_fp in mask_files:
            try:
                self._convert_file(mask_fp)
                self.logger.info(f"Converted {mask_fp.name} to GeoJSON")
            except Exception:
                self.logger.exception(f"Failed to convert {mask_fp.name}")

    def _convert_file(self, mask_fp: Path) -> None:
        """
        Convert one label mask into a GeoJSON FeatureCollection.

        Each non-zero label contributes one feature per external contour;
        interior holes are discarded (cv2.RETR_EXTERNAL).
        """
        mask = np.load(mask_fp)
        labels = np.unique(mask)
        labels = labels[labels != 0]  # 0 is background
        features = []

        for label in labels:
            binary = (mask == label).astype(np.uint8)
            contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

            for cnt in contours:
                coords = cnt.squeeze().tolist()
                # fewer than 3 vertices cannot form a polygon
                if len(coords) < 3:
                    continue
                # scale coordinates back to original resolution
                scaled = [[int(x * self.upscale), int(y * self.upscale)] for [x, y] in coords]
                # GeoJSON rings must be closed (first point == last point)
                if scaled[0] != scaled[-1]:
                    scaled.append(scaled[0])

                feature = {
                    "type": "Feature",
                    "properties": {"label": int(label)},
                    "geometry": {
                        "type": "Polygon",
                        "coordinates": [scaled]
                    }
                }
                features.append(feature)

        geojson = {"type": "FeatureCollection", "features": features}
        out_fp = self.output_dir / f"{mask_fp.stem}.geojson"
        with open(out_fp, "w") as f:
            json.dump(geojson, f)
utils/generate_image_overlays.py ADDED
@@ -0,0 +1,61 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Developed by Nikhil Nageshwar Inturi
4
+
5
+ Overlay original PNGs with their corresponding stitched masks,
6
+ then generate side-by-side comparison mosaics.
7
+ """
8
+
9
+ # imports
10
+ from pathlib import Path
11
+ from typing import Union
12
+ from PIL import Image, ImageOps, ImageEnhance
13
+ import os
14
+
15
+
16
class OverlayGenerator:
    """
    For each PNG in original_dir, locate the matching stitched mask in
    mask_dir and produce two outputs: a colour overlay with transparency
    and a side-by-side comparison composite.
    """
    def __init__(self, original_dir: Union[str, Path], mask_dir: Union[str, Path], output_dir: Union[str, Path],
                 mask_color: tuple = (255, 0, 0), alpha: float = 0.8) -> None:
        self.original_dir = Path(original_dir)
        self.mask_dir = Path(mask_dir)
        self.output_dir = Path(output_dir)
        self.mask_color = mask_color
        self.alpha = alpha
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def run(self) -> None:
        """Process every original PNG that has a matching stitched mask."""
        for original_path in self.original_dir.glob("*.png"):
            stem = original_path.stem
            mask_path = self.mask_dir / f"{stem}_mask_stitched.png"
            if not mask_path.exists():
                print(f"Warning: mask not found for {stem}")
                continue
            self._make_overlay(original_path, mask_path)
            self._make_comparison(original_path, mask_path)

    def _make_overlay(self, orig_path: Path, mask_path: Path) -> None:
        """Blend a single-colour layer, alpha-weighted by the mask, onto the original."""
        base = Image.open(orig_path).convert("RGBA")
        mask_gray = Image.open(mask_path).convert("L")
        # NOTE(review): the mask is not resized here; putalpha assumes it
        # matches the original's dimensions
        tint = Image.new("RGBA", base.size, self.mask_color + (0,))
        tint.putalpha(ImageEnhance.Brightness(mask_gray).enhance(self.alpha))
        composed = Image.alpha_composite(base, tint)
        composed.save(self.output_dir / f"{orig_path.stem}_overlay.png")

    def _make_comparison(self, orig_path: Path, mask_path: Path) -> None:
        """Write the original and its mask side by side in one image."""
        left = Image.open(orig_path).convert("RGB")
        right = Image.open(mask_path).convert("RGB")
        if left.size != right.size:
            right = right.resize(left.size, resample=Image.NEAREST)
        board = Image.new("RGB", (left.width * 2, left.height))
        board.paste(left, (0, 0))
        board.paste(right, (left.width, 0))
        board.save(self.output_dir / f"{orig_path.stem}_compare.png")
utils/generate_masks.py ADDED
@@ -0,0 +1,109 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Developed by Nikhil Nageshwar Inturi
4
+
5
+ Stitch tiled mask .npy files and mask PNGs into mosaics,
6
+ based on a Cellpose output root (4_cellpose_masks) containing
7
+ 'segmentation' and 'masks' subfolders, ignoring 'preview'.
8
+ """
9
+
10
+ # imports
11
+ from PIL import Image
12
+ from pathlib import Path
13
+ import logging, numpy as np, tifffile
14
+ # local imports
15
+ from utils.constants import *
16
+
17
+
18
class MaskStitcher:
    """
    Stitch both .npy masks and mask PNGs from a Cellpose output root:
    - Expects root with subfolders SEGMENTATION_DIR (.npy) and MASKS_DIR (.png)
    - Outputs mosaics in STITCHED_MASKS_DIR

    Tiles are grouped by base name; labels are renumbered globally during
    stitching so cell ids stay unique across tiles.
    """

    def __init__(self, input_dir: Path, output_dir: Path = None) -> None:
        """
        Args:
            input_dir: Cellpose output root containing the two subfolders.
            output_dir: Destination for mosaics; defaults to STITCHED_MASKS_DIR.
        """
        self.input_dir = Path(input_dir)
        self.seg_dir = self.input_dir / SEGMENTATION_DIR
        self.png_dir = self.input_dir / MASKS_DIR
        self.output_dir = Path(output_dir) if output_dir is not None else Path(STITCHED_MASKS_DIR)
        self.output_dir.mkdir(parents=True, exist_ok=True)
        self.logger = logging.getLogger(self.__class__.__name__)

    @staticmethod
    def _parse(fname: str):
        """Split '<base>_<row>_<col>.<ext>' into (base, row, col)."""
        stem = Path(fname).stem
        base, row, col = stem.rsplit("_", 2)
        return base, int(row), int(col)

    def _read_npy(self, fp: Path) -> np.ndarray:
        """Load a tile .npy; unwraps dict-style Cellpose saves ({'masks': ...})."""
        data = np.load(fp, allow_pickle=True)
        if data.dtype == object:
            data = data.item().get('masks')
        return data.astype(np.int32, copy=False)

    def _read_png(self, fp: Path) -> np.ndarray:
        """Load a tile mask PNG as an int32 label array."""
        arr = np.array(Image.open(fp))
        return arr.astype(np.int32, copy=False)

    def _groups(self, directory: Path, pattern: str):
        """Group tile files in *directory* matching *pattern* by base name."""
        groups = {}
        for fp in directory.glob(pattern):
            base, _, _ = self._parse(fp.name)
            groups.setdefault(base, []).append(fp)
        return groups

    def _layout(self, files, read_func):
        """
        Compute per-row heights, per-column widths, the resulting y/x
        offsets, and the total mosaic size.

        NOTE(review): every tile is read once here and again in _stitch,
        doubling the I/O — acceptable for moderate tile counts.
        """
        row_h = {}
        col_w = {}
        for fp in files:
            _, r, c = self._parse(fp.name)
            h, w = read_func(fp).shape
            row_h[r] = max(row_h.get(r, 0), h)
            col_w[c] = max(col_w.get(c, 0), w)
        y_off = {}
        x_off = {}
        y = x = 0
        for r in sorted(row_h):
            y_off[r] = y
            y += row_h[r]
        for c in sorted(col_w):
            x_off[c] = x
            x += col_w[c]
        return y_off, x_off, y, x

    def _stitch(self, files, read_func):
        """
        Paste all tiles into one mosaic, renumbering each tile-local label
        to a fresh global label so ids never collide across tiles.
        """
        y_off, x_off, H, W = self._layout(files, read_func)
        mosaic = np.zeros((H, W), dtype=np.int32)
        next_lbl = 1
        for fp in files:
            _, r, c = self._parse(fp.name)
            tile = read_func(fp)
            # np.unique is sorted, so [1:] skips the first label — this
            # assumes background 0 is present in every tile; otherwise the
            # smallest cell label would be dropped (TODO confirm)
            for lbl in np.unique(tile)[1:]:
                mask_region = (tile == lbl)
                yy = y_off[r]
                xx = x_off[c]
                region = mosaic[yy:yy+tile.shape[0], xx:xx+tile.shape[1]]
                region[mask_region] = next_lbl
                next_lbl += 1
        return mosaic

    def stitch_all(self) -> None:
        """
        Stitch every tile group: .npy segmentations (saved as .npy plus a
        binary preview TIFF) and mask PNGs (saved as 16-bit PNG).
        """
        seg_groups = self._groups(self.seg_dir, "*.npy")
        for base, files in seg_groups.items():
            self.logger.info(f"Stitching segmentation for '{base}' ")
            mosaic = self._stitch(files, self._read_npy)
            out_npy = self.output_dir / f"{base}_stitched.npy"
            np.save(out_npy, mosaic)
            self.logger.info(f"Saved stitched .npy: {out_npy}")
            # binary foreground TIFF (255 where any cell) for quick viewing
            out_tif = self.output_dir / f"{base}_stitched.tif"
            tifffile.imwrite(out_tif, (mosaic>0).astype(np.uint8)*255, photometric="minisblack")
            self.logger.info(f"Saved stitched TIFF: {out_tif}")

        png_groups = self._groups(self.png_dir, "*.png")
        for base, files in png_groups.items():
            self.logger.info(f"Stitching mask PNGs for '{base}' ")
            mosaic = self._stitch(files, self._read_png)
            out_png = self.output_dir / f"{base}_mask_stitched.png"
            Image.fromarray(mosaic.astype(np.uint16)).save(out_png)
            self.logger.info(f"Saved stitched mask PNG: {out_png}")
utils/generate_metrics.py ADDED
File without changes
utils/generate_plots.py ADDED
@@ -0,0 +1,175 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Developed by Nikhil Nageshwar Inturi
4
+
5
+ This module provides PlotGenerator to process all masks in a directory:
6
+ - For each mask, find its image by matching name stem, then output:
7
+ 1) a binary mask PNG,
8
+ 2) an overlay PNG with colored mask + boundaries.
9
+ """
10
+
11
+ import matplotlib.pyplot as plt
12
+ import numpy as np
13
+ from PIL import Image
14
+ from skimage.segmentation import find_boundaries
15
+ from pathlib import Path
16
+ import logging
17
+
18
class PlotGenerator:
    """
    Process every .npy mask in mask_dir and generate
    corresponding plots using images from image_dir.

    For each mask ``<stem>.npy`` the image is located by globbing
    image_dir for ``<stem>*.png``; two files are written to output_dir:
    ``<stem>_binary.png`` and ``<stem>_overlay.png``.
    """

    def __init__(
        self,
        image_dir: Path,
        mask_dir: Path,
        output_dir: Path,
        overlay_color: tuple[int, int, int] = (238, 144, 144),
        boundary_color: tuple[int, int, int] = (100, 100, 255),
        alpha: float = 0.5
    ) -> None:
        # Normalize to Path so plain-string arguments also work.
        self.image_dir = Path(image_dir)
        self.mask_dir = Path(mask_dir)
        self.output_dir = Path(output_dir)
        self.overlay_color = np.array(overlay_color, dtype=np.uint8)
        self.boundary_color = np.array(boundary_color, dtype=np.uint8)
        self.alpha = alpha
        self.logger = logging.getLogger(self.__class__.__name__)
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def run(self) -> None:
        """Generate a binary-mask plot and an overlay plot for every mask found."""
        mask_paths = sorted(self.mask_dir.glob("*.npy"))
        if not mask_paths:
            self.logger.warning(f"No .npy masks found in {self.mask_dir}")
            return

        for mask_path in mask_paths:
            stem = mask_path.stem
            # sorted() makes the pick deterministic when several images share
            # the stem prefix (raw glob order is filesystem-dependent).
            img_candidates = sorted(self.image_dir.glob(f"{stem}*.png"))
            if not img_candidates:
                self.logger.warning(f"No image found for mask '{stem}'")
                continue
            image_path = img_candidates[0]

            img = np.array(Image.open(image_path).convert("RGB"))
            mask = np.load(mask_path)
            if img.shape[:2] != mask.shape[:2]:
                # A mismatched pair would raise deep inside the overlay
                # indexing; skip it with a clear message instead.
                self.logger.warning(
                    f"Shape mismatch for '{stem}': image {img.shape[:2]} "
                    f"vs mask {mask.shape[:2]}; skipping"
                )
                continue

            self._save_binary(stem, mask)
            self._save_overlay(stem, img, mask)

    def _save_binary(self, stem: str, mask: np.ndarray) -> None:
        """Write <stem>_binary.png: white foreground on black background."""
        binary = (mask > 0).astype(np.uint8)
        plt.figure(figsize=(10, 10))
        plt.imshow(binary, cmap='gray')
        plt.axis('off')
        plt.title(f"{stem} - Binary Mask")
        out_gray = self.output_dir / f"{stem}_binary.png"
        plt.savefig(out_gray, bbox_inches='tight', dpi=300)
        plt.close()
        self.logger.info(f"Saved binary mask plot: {out_gray.name}")

    def _save_overlay(self, stem: str, img: np.ndarray, mask: np.ndarray) -> None:
        """Write <stem>_overlay.png: image blended with mask color plus boundary lines."""
        overlay = img.copy()
        mask_bool = mask > 0
        # Alpha-blend the overlay colour onto masked pixels only.
        overlay[mask_bool] = (
            (1 - self.alpha) * img[mask_bool] + self.alpha * self.overlay_color
        ).astype(np.uint8)
        boundaries = find_boundaries(mask_bool, mode='outer')
        overlay[boundaries] = self.boundary_color

        plt.figure(figsize=(10, 10))
        plt.imshow(overlay)
        plt.axis('off')
        plt.title(f"{stem} - Mask Overlay")
        out_overlay = self.output_dir / f"{stem}_overlay.png"
        plt.savefig(out_overlay, bbox_inches='tight', dpi=300)
        plt.close()
        self.logger.info(f"Saved overlay plot: {out_overlay.name}")
87
+
88
+
89
+ # # main.py snippet (to run plots for all masks)
90
+ # from utils.generate_plots import PlotGenerator
91
+
92
+ # plotter = PlotGenerator(
93
+ # image_dir=PNG_IMAGES_DIR,
94
+ # mask_dir=STITCHED_MASKS_DIR,
95
+ # output_dir=OUTPUT_DIR,
96
+ # overlay_color=(238,144,144),
97
+ # boundary_color=(100,100,255),
98
+ # alpha=0.5
99
+ # )
100
+ # plotter.run()
101
+
102
+
103
+
104
+
105
+ # import matplotlib.pyplot as plt
106
+ # import numpy as np
107
+
108
+ # # Create a binary mask (optional: if combined_mask contains labels like 1,2,3...)
109
+ # binary_mask = (combined_mask > 0).astype(np.uint8)
110
+
111
+ # plt.figure(figsize=(10, 10))
112
+ # plt.imshow(binary_mask, cmap='gray') # all non-zero values will be gray
113
+ # plt.title('Combined Mask - Single Color')
114
+ # plt.axis('off')
115
+
116
+ # # Save the figure as PNG
117
+ # plt.savefig('combined_mask_grey_testing_model.png', bbox_inches='tight', dpi=300)
118
+
119
+ # # Show the plot
120
+ # plt.show()
121
+
122
+
123
+
124
+
125
+ # from PIL import Image
126
+ # import numpy as np
127
+
128
+ # image = Image.open("jayden_img.ome.png").convert("RGB") # PNG_IMAGES_DIR = CONFIG_DIR / '2_png_images'
129
+ # image_np = np.array(image)
130
+ # mask = np.load("combined_full_mask_testing_model.npy") # STITCHED_MASKS_DIR = CONFIG_DIR / '5_stitched_masks'
131
+
132
+
133
+
134
+
135
+
136
+ # import numpy as np
137
+ # import matplotlib.pyplot as plt
138
+ # from PIL import Image
139
+ # from skimage.segmentation import find_boundaries
140
+
141
+ # # Define colors
142
+ # overlay_color = np.array([238, 144, 144]) # Light green
143
+ # boundary_color = np.array([100, 100, 255]) # Navy blue
144
+ # alpha = 0.5 # Transparency for overlay
145
+
146
+ # # Ensure mask is binary
147
+ # mask = (mask > 0).astype(np.uint8)
148
+
149
+ # # Create a copy for overlay
150
+ # overlay = image_np.copy()
151
+
152
+ # # Apply overlay color where mask is 1
153
+ # overlay[mask == 1] = ((1 - alpha) * image_np[mask == 1] + alpha * overlay_color).astype(np.uint8)
154
+
155
+ # # --- Add navy blue boundaries ---
156
+ # from skimage.segmentation import find_boundaries
157
+
158
+ # # Find boundaries in the mask
159
+ # boundaries = find_boundaries(mask, mode='outer')
160
+
161
+ # # Draw boundary color
162
+ # overlay[boundaries] = boundary_color
163
+
164
+ # # Show plot
165
+ # plt.figure(figsize=(10, 10))
166
+ # plt.imshow(overlay)
167
+ # plt.axis("off")
168
+ # plt.title("Image with Mask Overlay and Navy Blue Boundary")
169
+ # plt.show()
170
+
171
+ # # Save the image
172
+ # output = Image.fromarray(overlay)
173
+ # output.save("0_image_with_mask_overlay_with_white_boundary_model.png")
174
+
175
+ # # output dir where pltos needs to be saved: OUTPUT_DIR = CONFIG_DIR / '6_output_masks'
utils/generate_pngs.py ADDED
@@ -0,0 +1,74 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Developed by Nikhil Nageshwar Inturi
4
+
5
+ This module provides TiffToPngConverter for converting TIFF images to PNG format,
6
+ applying a scaling factor to resize the images.
7
+ """
8
+
9
+ # imports
10
+ import logging, tifffile
11
+ from pathlib import Path
12
+ from typing import Union
13
+ from PIL import Image
14
+ # local imports
15
+ from utils.constants import *
16
+
17
+
18
class TiffToPngConverter:
    """
    Converts all TIFF images in a source directory to PNG format,
    applying a scaling factor to resize the images.
    """

    def __init__(self, scaling_factor: float, tif_dir: Union[str, Path], output_dir: Union[str, Path]) -> None:
        # Store configuration, then make sure the destination exists.
        self.scaling_factor = scaling_factor
        self.tif_dir = Path(tif_dir)
        self.output_dir = Path(output_dir)
        self.logger = logging.getLogger(self.__class__.__name__)
        self._setup_output_directory()

    def _setup_output_directory(self) -> None:
        """Create the output directory (and any parents); re-raise on failure."""
        try:
            self.output_dir.mkdir(parents=True, exist_ok=True)
        except Exception as e:
            self.logger.error(f"Failed to create output directory {self.output_dir}: {e}")
            raise
        self.logger.debug(f"Output directory ready: {self.output_dir}")

    def convert_all(self) -> None:
        """
        Convert all .tif files in the source directory.

        A failure on one file is logged and does not stop the batch.
        """
        tif_files = list(self.tif_dir.glob("*.tif"))
        if not tif_files:
            self.logger.warning(f"No .tif files found in {self.tif_dir}")
            return
        for tif_path in tif_files:
            try:
                self.convert_file(tif_path)
            except Exception:
                self.logger.exception(f"Error converting file: {tif_path}")

    def convert_file(self, tif_path: Path) -> None:
        """
        Convert a single TIFF file to PNG, resizing by the scaling factor.

        Args:
            tif_path: Path to the input .tif file.
        """
        arr = tifffile.imread(str(tif_path), level=0)
        self.logger.debug(f"Read {tif_path.name} with shape {arr.shape}")
        # PIL expects (width, height) for the target size.
        target = (int(arr.shape[1] * self.scaling_factor), int(arr.shape[0] * self.scaling_factor))
        resized = Image.fromarray(arr).resize(target, resample=Image.LANCZOS)
        output_path = self.output_dir / tif_path.with_suffix(".png").name
        resized.save(output_path, format="PNG")
        self.logger.info(f"Converted {tif_path.name} to {output_path}")
69
+
70
+
71
+ # testing
72
+ # setup_logging(logging.INFO)
73
+ # converter = TiffToPngConverter(0.2125, 'path/to/tifs', 'path/to/output')
74
+ # converter.convert_all()
utils/generate_split_images.py ADDED
@@ -0,0 +1,93 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Developed by Nikhil Nageshwar Inturi
4
+
5
+ This module provides ImageSplitter for splitting PNG images into
6
+ equal-sized sub-images and saving them to an output directory.
7
+ """
8
+
9
+ # imports
10
+ from pathlib import Path
11
+ import numpy as np, cv2, logging
12
+ from PIL import Image
13
+ # local imports
14
+ from utils.constants import setup_logging
15
+ Image.MAX_IMAGE_PIXELS = None
16
+
17
+
18
class ImageSplitter:
    """
    Splits all PNG images in a source directory into sub-images of specified width and height.
    """

    def __init__(self, source_dir: Path, output_dir: Path, sub_image_width: int, sub_image_height: int) -> None:
        self.source_dir = Path(source_dir)
        self.output_dir = Path(output_dir)
        self.sub_w = sub_image_width
        self.sub_h = sub_image_height
        self.logger = logging.getLogger(self.__class__.__name__)
        self._setup_output_directory()

    def _setup_output_directory(self) -> None:
        """Create the output directory (and any parents); re-raise on failure."""
        try:
            self.output_dir.mkdir(parents=True, exist_ok=True)
            self.logger.debug(f"Output directory ready: {self.output_dir}")
        except Exception as e:
            self.logger.error(f"Failed to create output directory {self.output_dir}: {e}")
            raise

    def split_all(self) -> None:
        """
        Iterate over all PNG files in source_dir and split them.

        A failure on one file is logged and does not stop the batch.
        """
        png_files = list(self.source_dir.glob("*.png"))
        if not png_files:
            self.logger.warning(f"No .png files found in {self.source_dir}")
            return

        for png_file in png_files:
            try:
                self.split_file(png_file)
            except Exception:
                self.logger.exception(f"Error splitting file: {png_file}")

    @staticmethod
    def _to_bgr(sub_img: np.ndarray) -> np.ndarray:
        """Swap an RGB(A) tile to BGR(A) for cv2.imwrite; pass grayscale through.

        PIL loads images as RGB/RGBA while OpenCV writes BGR/BGRA, so the
        channel order must be swapped -- but only when colour channels
        exist: an unconditional COLOR_RGB2BGR conversion crashes on
        grayscale (2-D) PNGs and mishandles RGBA input.
        """
        if sub_img.ndim == 2:
            return sub_img
        if sub_img.shape[2] == 3:
            return cv2.cvtColor(sub_img, cv2.COLOR_RGB2BGR)
        if sub_img.shape[2] == 4:
            return cv2.cvtColor(sub_img, cv2.COLOR_RGBA2BGRA)
        return sub_img

    def split_file(self, png_path: Path) -> None:
        """
        Split a single PNG image into sub-images.

        Tiles at the right/bottom edges may be smaller than
        (sub_image_height, sub_image_width); no padding is applied.

        Args:
            png_path: Path to the input .png file.
        """
        with Image.open(png_path) as pil_img:
            img = np.array(pil_img)
        self.logger.debug(f"Loaded {png_path.name} with shape {img.shape}")

        height, width = img.shape[:2]
        # Ceiling division: partial edge tiles still get written.
        cols = (width + self.sub_w - 1) // self.sub_w
        rows = (height + self.sub_h - 1) // self.sub_h

        for row in range(rows):
            for col in range(cols):
                x0 = col * self.sub_w
                y0 = row * self.sub_h
                x1 = min(x0 + self.sub_w, width)
                y1 = min(y0 + self.sub_h, height)
                sub_img = img[y0:y1, x0:x1]

                output_name = f"{png_path.stem}_{row}_{col}.png"
                output_path = self.output_dir / output_name
                success = cv2.imwrite(str(output_path), self._to_bgr(sub_img))
                if success:
                    self.logger.info(f"Saved sub-image: {output_name}")
                else:
                    self.logger.error(f"Failed to save sub-image: {output_name}")
86
+
87
+
88
+ # testing
89
+ # from bin.constants import setup_logging
90
+ # from bin.generate_split_images import ImageSplitter
91
+ # setup_logging(logging.INFO)
92
+ # splitter = ImageSplitter(source_dir=Path("path/to/pngs"), output_dir=Path("path/to/splits"), sub_image_width=640, sub_image_height=640)
93
+ # splitter.split_all()
utils/generate_training_dataset.py ADDED
@@ -0,0 +1,64 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Create a uint16 mask PNG from a GeoJSON file and write two down‑sampled
3
+ previews (black‑and‑white and random‑colour) for easy visual inspection.
4
+ """
5
+
6
+ from pathlib import Path
7
+ import warnings, numpy as np, geopandas as gpd, rasterio, cv2
8
+ from shapely.geometry import Polygon, MultiPolygon
9
+ from PIL import Image
10
+ from tifffile import imread
11
+ from utils.constants import *
12
+ Image.MAX_IMAGE_PIXELS = None
13
+ warnings.filterwarnings("ignore", category=Image.DecompressionBombWarning)
14
+
15
def geojson_to_mask_png(image_path: Path, geojson_path: Path, mask_out: Path) -> None:
    """
    Rasterize GeoJSON polygons into a uint16 label mask PNG.

    Each feature gets a distinct label (its 1-based index); pixels outside
    every polygon stay 0.  The mask matches the reference image's size, and
    geometries are reprojected to its CRS when both CRSs are known.

    Args:
        image_path: georeferenced reference image (provides size/transform/CRS).
        geojson_path: vector annotations to rasterize.
        mask_out: destination path for the uint16 mask (written via cv2).
    """
    with rasterio.open(image_path) as src:
        transform = src.transform
        height, width = src.height, src.width
        crs = src.crs

    mask = np.zeros((height, width), dtype=np.uint16)
    gdf = gpd.read_file(geojson_path)

    if crs is not None and gdf.crs is not None and gdf.crs != crs:
        gdf = gdf.to_crs(crs)

    def world_to_px(tx, x, y):
        # ~tx maps world coordinates -> (col, row) in pixel space.
        c, r = ~tx * (x, y)
        return int(round(c)), int(round(r))

    def ring_to_px(ring):
        """Convert one linear ring's coordinates to int32 pixel vertices."""
        return np.array(
            [world_to_px(transform, x, y) for x, y in ring.coords],
            dtype=np.int32
        )

    for idx, geom in enumerate(gdf.geometry, start=1):
        polys = geom.geoms if isinstance(geom, MultiPolygon) else [geom]
        for poly in polys:
            # Pass exterior AND interior rings together so cv2.fillPoly keeps
            # holes unfilled; filling only the exterior (as before) paints
            # over any holes in the polygon.
            rings = [ring_to_px(poly.exterior)]
            rings.extend(ring_to_px(interior) for interior in poly.interiors)
            cv2.fillPoly(mask, rings, color=idx)

    cv2.imwrite(str(mask_out), mask)
44
+
45
+
46
def make_bw_preview(label_mask: np.ndarray, out_png: Path, downsample: int = 8):
    """Save a black-and-white preview: labelled pixels white, background black.

    Args:
        label_mask: 2-D integer label image.
        out_png: destination PNG path.
        downsample: keep every n-th pixel along both axes (1 = full size).
    """
    foreground = np.where(label_mask > 0, np.uint8(255), np.uint8(0))
    if downsample > 1:
        foreground = foreground[::downsample, ::downsample]
    Image.fromarray(foreground, mode="L").save(out_png)
    print(f"saved B/W preview → {out_png}")
52
+
53
+
54
def make_colored_preview(label_mask: np.ndarray, out_png: Path, downsample: int = 8):
    """Save an RGB preview that gives each label a reproducible random colour.

    Label 0 (background) maps to black; other labels are coloured from a
    fixed-seed (42) random palette so repeated runs look identical.

    Args:
        label_mask: 2-D integer label image with labels 0..max.
        out_png: destination PNG path.
        downsample: keep every n-th pixel along both axes (1 = full size).
    """
    rng = np.random.default_rng(42)
    n_labels = label_mask.max()
    palette = np.zeros((n_labels + 1, 3), dtype=np.uint8)
    # Single draw of shape (n_labels, 3) -- same values as before.
    palette[1:] = rng.integers(0, 256, size=(n_labels, 3), dtype=np.uint8)
    rgb = palette[label_mask]
    if downsample > 1:
        rgb = rgb[::downsample, ::downsample]
    Image.fromarray(rgb, mode="RGB").save(out_png)
    print(f"saved colour preview → {out_png}")
utils/generate_training_split_img_masks.py ADDED
@@ -0,0 +1,86 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Split every TIFF in IMG_DIR and its matching mask in MASK_DIR into 640×640 or 1024x1024 tiles.
4
+ Image tiles are saved as 8‑bit TIFFs: mask tiles are saved
5
+ as 16‑bit TIFFs with “_mask” suffix so Cellpose can pair them automatically.
6
+
7
+ Result:
8
+ OUT_DIR/
9
+ <stem>_0_0.tif
10
+ <stem>_0_0_mask.tif
11
+ <stem>_0_1.tif
12
+ <stem>_0_1_mask.tif
13
+ ...
14
+
15
+ Author : Nikhil Nageshwar Inturi  (modified to handle separate img/mask dirs)
16
+ """
17
+
18
+ # imports
19
+ from pathlib import Path
20
+ import logging, numpy as np, tifffile, cv2
21
+ from utils.constants import setup_logging
22
+
23
+ setup_logging(logging.INFO)
24
+
25
+ # IMG_DIR = Path("/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/1_tif_images")
26
+ # MASK_DIR = Path("/Users/discovery/Downloads/xenium_testing_jit/spinal_cord_samples_fr/7_mask_images")
27
+ # OUT_DIR = IMG_DIR.parent / "8_split_masks"
28
+
29
+ # TILE_H = TILE_W = 1024
30
+
31
def read_tif(path: Path) -> np.ndarray:
    """Load a TIFF as a numpy array, moving channels last when needed.

    tifffile may return channel-first (C, H, W) data; when the leading axis
    looks like a small channel count (<= 4), it is moved to the end so the
    result is (H, W, C).  2-D arrays are returned unchanged.
    """
    arr = tifffile.imread(path)
    channel_first = arr.ndim == 3 and arr.shape[0] <= 4
    return np.moveaxis(arr, 0, -1) if channel_first else arr
37
+
38
+
39
def pad(tile: np.ndarray, th: int, tw: int) -> np.ndarray:
    """Zero-pad `tile` on the bottom/right up to (th, tw) in its first two dims.

    Trailing dimensions (e.g. channels) are left untouched.  A tile already
    at the target size is returned as-is, without copying.
    """
    extra_h = th - tile.shape[0]
    extra_w = tw - tile.shape[1]
    if not (extra_h or extra_w):
        return tile
    pad_spec = [(0, extra_h), (0, extra_w)] + [(0, 0)] * (tile.ndim - 2)
    return np.pad(tile, pad_spec, mode="constant", constant_values=0)
45
+
46
+
47
def tile_pair(img_path: Path, mask_path: Path, out_dir: Path, TILE_H: int, TILE_W: int):
    """Cut one image/mask pair into aligned TILE_H x TILE_W tiles and save them.

    Image tiles are written as 8-bit TIFFs (min-max normalized when the
    source is not already uint8); mask tiles keep their uint16 labels and
    get a "_masks" suffix so Cellpose can pair them with the image tiles.

    Raises:
        ValueError: if the image and mask spatial dimensions differ.
    """
    stem = img_path.stem
    img = read_tif(img_path)
    mask = read_tif(mask_path).astype(np.uint16)
    if img.shape[:2] != mask.shape[:2]:
        raise ValueError(f"Dimension mismatch {img_path.name} vs {mask_path.name}")

    H, W = img.shape[:2]
    # Ceiling division so partial edge tiles are included (then padded).
    n_rows = (H + TILE_H - 1) // TILE_H
    n_cols = (W + TILE_W - 1) // TILE_W

    for r in range(n_rows):
        for c in range(n_cols):
            y0, x0 = r * TILE_H, c * TILE_W
            y1, x1 = min(y0 + TILE_H, H), min(x0 + TILE_W, W)
            img_tile = pad(img[y0:y1, x0:x1], TILE_H, TILE_W)
            msk_tile = pad(mask[y0:y1, x0:x1], TILE_H, TILE_W)

            # Bring non-uint8 image data into displayable 8-bit range.
            if img_tile.dtype != np.uint8:
                img_write = cv2.normalize(
                    img_tile, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
            else:
                img_write = img_tile

            img_name = f"{stem}_{r}_{c}.tif"
            msk_name = f"{stem}_{r}_{c}_masks.tif"
            tifffile.imwrite(out_dir / img_name, img_write)
            tifffile.imwrite(out_dir / msk_name, msk_tile)
            logging.info("saved %s / %s", img_name, msk_name)
75
+
76
+
77
def split_folder(img_dir: Path, mask_dir: Path, out_dir: Path, TILE_H: int, TILE_W: int):
    """Tile every image in img_dir that has a matching "<stem>_masks.tif" in mask_dir.

    Images without a mask are skipped with a warning; files in img_dir that
    are themselves masks ("*_masks.tif") are ignored.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    for img_path in img_dir.glob("*.tif"):
        if img_path.name.endswith("_masks.tif"):
            continue  # this is a mask, not an image
        candidate = mask_dir / f"{img_path.stem}_masks.tif"
        if candidate.exists():
            tile_pair(img_path, candidate, out_dir, TILE_H, TILE_W)
        else:
            logging.warning("no mask found for %s, skipping", img_path.name)