roobee79/HNE2Cell

@@ -33,10 +33,10 @@ It takes 256×256 px patches as input and outputs per-cell contours, centroids,
 The full pipeline consists of three steps:
 ```
-┌───────────────┐       ┌────────────────┐      ┌───────────────────┐
-│ 1. Normalize  │ ──→  │  2. Patchify   │ ──→ │  3. Inference   │
-│   (Reinhard)  │      │ (256px, 64ov)  │     │ (Cell Detection)│
-└───────────────┘       └────────────────┘      └───────────────────┘
     SVS / TIF             PNG patches          Masks + Centroids
 ```
@@ -52,17 +52,85 @@ The full pipeline consists of three steps:
 ---
-## Quick Start
-### 1. Install dependencies
 ```bash
-pip install torch torchvision tifffile scikit-image pillow opencv-python-headless pandas tqdm
-# Optional (for .svs files):
-pip install openslide-python
 ```
-### 2. Download the model
 ```python
 from huggingface_hub import hf_hub_download
@@ -73,37 +141,12 @@ model_path = hf_hub_download(
 )
 ```
-### 3. Run the full pipeline
-```bash
-# Step 1: Color normalization (Reinhard method)
-python normalize.py \
-    --input_dir /path/to/slides \
-    --target /path/to/standard-ilc.tif
-# Step 2: Extract patches (40x recommended)
-python patchify.py \
-    --input_dir /path/to/slides \
-    --magnification 40 \
-    --patch_size 256 \
-    --overlap 64 \
-    --workers 8
-# Step 3: Cell detection & classification
-python inference.py \
-    --input_dir /path/to/patch_folders \
-    --output_dir /path/to/results \
-    --model_path ./HNE2cell_all_patch73_jit.pt \
-    --magnification 40 \
-    --batch_size 32
-```
 ---
-## Example: Reproducible Walkthrough
-To verify your installation, run the pipeline on the example slide included in this repository
-(`TCGA-56-8628-01Z-00-DX1`, LUSC, ~36 MB).
 ### Download the model, example slide, and reference image
@@ -167,13 +210,52 @@ example/results/
 └── patch_*_centroid.csv   # Cell centroids with type labels
 ```
-Approximate runtime on a single NVIDIA A100: **~20 min** for the full slide
 > The example slide is from **TCGA-LUSC** and is redistributed under the
 > [NIH Genomic Data Sharing Policy](https://sharing.nih.gov/genomic-data-sharing-policy).
 ---
 ## Input / Output Details
 ### Input
@@ -245,10 +327,10 @@ If you use HNE2Cell in your research, please cite:
 ```bibtex
 @misc{hne2cell,
   title={Spatial transcriptomics–supervised deep learning enables single-cell mapping of tumor immune architecture from routine histology},
-  year={2025},
   url={https://huggingface.co/roobee79/HNE2Cell}
 }
 ```
-The example slide is derived from data generated by the TCGA Research Network:
-<https://www.cancer.gov/tcga>.

 The full pipeline consists of three steps:
 ```
+┌───────────────┐      ┌────────────────┐     ┌──────────────────┐
+│ 1. Normalize  │ ──→  │  2. Patchify   │ ──→ │  3. Inference    │
+│   (Reinhard)  │      │ (256px, 64ov)  │     │ (Cell Detection) │
+└───────────────┘      └────────────────┘     └──────────────────┘
     SVS / TIF             PNG patches          Masks + Centroids
 ```
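As a rough illustration of the patchify step in the diagram, the sketch below computes patch origins along one image axis for 256 px patches with 64 px overlap. It assumes that consecutive patches advance by `patch_size - overlap` pixels and that a last patch is shifted back to end flush with the image edge. Neither detail is confirmed by this README, and `patch_origins` is a hypothetical helper, not part of `patchify.py`.

```python
def patch_origins(length: int, patch_size: int = 256, overlap: int = 64) -> list[int]:
    """Return top-left coordinates of patches along one axis.

    Assumption (not taken from patchify.py): the stride is
    patch_size - overlap, and a final patch is shifted back so it
    ends exactly at the image edge.
    """
    if overlap >= patch_size:
        raise ValueError("overlap must be smaller than patch_size")
    if length <= patch_size:
        # Image is no larger than one patch: a single origin at 0.
        return [0]
    stride = patch_size - overlap
    origins = list(range(0, length - patch_size + 1, stride))
    if origins[-1] + patch_size < length:
        # Cover the remaining strip with one edge-aligned patch.
        origins.append(length - patch_size)
    return origins


if __name__ == "__main__":
    print(patch_origins(700))
```

With the defaults the stride is 192 px, so neighbouring patches share a 64 px band.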
 ---
+## System Requirements
+### Software dependencies (tested versions)
+Core packages (as reported in the manuscript):
+- Python 3.10
+- pytorch == 2.5.1
+- timm == 1.0.8
+- transformers == 4.44.0
+- scanpy == 1.10.3
+- squidpy == 1.5.0
+- spatialdata == 0.2.5
+- scikit-image == 0.24.0
+- scikit-learn == 1.2.2
+- scipy == 1.13.1
+- shapely == 2.0.7
+Additional utilities required by the pipeline scripts:
+- torchvision (matching the PyTorch 2.5.1 release)
+- tifffile, Pillow, opencv-python-headless, pandas, tqdm
+- huggingface_hub
+- openslide-python (optional, for `.svs` files)
+### Operating systems tested
+- Ubuntu 22.04 LTS
+- Ubuntu 20.04 LTS
+Windows and macOS have not been tested.
+### Hardware requirements
+> **Note:** WSI processing is memory-intensive. This pipeline is designed for
+> server- or workstation-class hardware, not standard desktops.
+**Minimum (small WSIs, ~1–2 GB):**
+- GPU: NVIDIA GPU with ≥12 GB VRAM
+- RAM: 32 GB (64 GB strongly recommended)
+- Disk: 100 GB free
+**Recommended (typical WSIs, 2–10 GB):**
+- GPU: NVIDIA A100 / RTX 4090 / RTX 3090 (≥24 GB VRAM)
+- RAM: ≥128 GB
+- Disk: 500 GB+ free (intermediate `Aligned-hne.tif` can be 20–50 GB per slide)
+**Tested configurations:**
+- NVIDIA A100 (40 GB VRAM), 256 GB RAM, Ubuntu 22.04
+- NVIDIA RTX 3060 (12 GB VRAM), 64 GB RAM, Ubuntu 22.04
+CPU-only inference is not supported in practice: full WSI inference would take
+days even on a high-core-count CPU.
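A quick pre-flight check of system RAM against the figures above could look like the following sketch. It reads physical memory through POSIX `sysconf`, which is available on the Ubuntu systems listed as tested. The 32 GB and 128 GB thresholds come from the hardware list, while the 5% tolerance and the function names are assumptions made for illustration. The tolerance is there because a machine sold as 32 GB usually reports slightly less usable memory.

```python
import os

MIN_RAM_GB = 32           # minimum from the hardware list
RECOMMENDED_RAM_GB = 128  # recommended from the hardware list
TOLERANCE = 0.95          # assumption: allow 5% for memory reserved by firmware/kernel


def total_ram_gb() -> float:
    """Physical memory in GiB, read via POSIX sysconf (Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3


def ram_status(total_gb: float) -> str:
    """Classify a RAM size against the minimum and recommended figures."""
    if total_gb < MIN_RAM_GB * TOLERANCE:
        return "below minimum"
    if total_gb < RECOMMENDED_RAM_GB * TOLERANCE:
        return "meets minimum"
    return "meets recommended"


if __name__ == "__main__":
    ram = total_ram_gb()
    print(f"{ram:.1f} GiB RAM: {ram_status(ram)}")
```

This only covers system RAM; GPU memory would need a separate check.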
+---
+## Installation Guide
+### Recommended: Conda environment from `cellvit_rv3.yml`
+The repository includes a frozen conda environment file with all dependencies pinned
+to the exact versions used in the manuscript.
 ```bash
+# 1. Download environment file
+wget https://huggingface.co/roobee79/HNE2Cell/resolve/main/cellvit_rv3.yml
+# 2. Create environment
+conda env create -f cellvit_rv3.yml
+# 3. Activate
+conda activate cellvit_rv3
 ```
+**Typical install time:** ~10–15 minutes on a Linux server with a stable network connection
+(dominated by the PyTorch + CUDA toolkit download).
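After activating the environment, one way to confirm that the pinned versions were installed is to compare them with what `importlib.metadata` reports. This is a minimal sketch, not part of the repository. The version numbers are copied from the dependency list above, `torch` is used as the distribution name for PyTorch, and `find_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions copied from the "Software dependencies" list.
# "torch" is the distribution name under which PyTorch is installed.
EXPECTED = {
    "torch": "2.5.1",
    "timm": "1.0.8",
    "transformers": "4.44.0",
    "scanpy": "1.10.3",
    "squidpy": "1.5.0",
    "spatialdata": "0.2.5",
    "scikit-image": "0.24.0",
    "scikit-learn": "1.2.2",
    "scipy": "1.13.1",
    "shapely": "2.0.7",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return one message per package that is missing or has another version."""
    problems = []
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {wanted})")
            continue
        # Ignore local build tags such as "+cu121" when comparing.
        if found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found} (expected {wanted})")
    return problems


if __name__ == "__main__":
    for line in find_mismatches(EXPECTED) or ["All pinned packages match."]:
        print(line)
```

The `get_version` parameter exists only so the check can be exercised without the real packages installed.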
+### Download the model
 ```python
 from huggingface_hub import hf_hub_download
 )
 ```
 ---
+## Demo: Reproducible Walkthrough
+To verify your installation, run the pipeline on the example slide included in this
+repository (`TCGA-56-8628-01Z-00-DX1`, LUSC, ~36 MB).
 ### Download the model, example slide, and reference image
 └── patch_*_centroid.csv   # Cell centroids with type labels
 ```
+**Expected results on the example slide (`TCGA-56-8628-01Z-00-DX1`):**
+Approximately **63,000 cells** are detected across the 16 classes.
+Counts may vary by a few percent between hardware configurations.
+### Expected runtime
+| Hardware | Full pipeline runtime |
+|---|---|
+| NVIDIA A100 (40 GB) + 256 GB RAM | ~20 min |
+| NVIDIA RTX 3060 (12 GB) + 64 GB RAM | ~30 min |
+Systems with less than 32 GB of RAM will fail at the normalization step,
+which loads the image at full resolution.
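To compare a run against the expected cell count, the centroid files can be tallied with a short script. The sketch below assumes that each `patch_*_centroid.csv` has one header row followed by one row per cell; the actual column layout is not shown in this diff. It also does not correct for cells that may be reported twice in overlapping patches, so treat the total as a rough sanity check. `count_cells` is a hypothetical helper.

```python
import csv
from pathlib import Path


def count_cells(results_dir) -> int:
    """Sum the data rows of every patch_*_centroid.csv under results_dir."""
    total = 0
    for csv_path in sorted(Path(results_dir).rglob("patch_*_centroid.csv")):
        with csv_path.open(newline="") as handle:
            rows = list(csv.reader(handle))
        # Assumption: the first row is a header, every later
        # non-empty row describes one detected cell.
        total += sum(1 for row in rows[1:] if row)
    return total


if __name__ == "__main__":
    print(count_cells("example/results"))
```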
 > The example slide is from **TCGA-LUSC** and is redistributed under the
 > [NIH Genomic Data Sharing Policy](https://sharing.nih.gov/genomic-data-sharing-policy).
 ---
+## Instructions for Use (On Your Own Data)
+```bash
+# Step 1: Color normalization (Reinhard method)
+python normalize.py \
+    --input_dir /path/to/slides \
+    --target /path/to/standard-ilc.tif
+# Step 2: Extract patches (40x recommended)
+python patchify.py \
+    --input_dir /path/to/slides \
+    --magnification 40 \
+    --patch_size 256 \
+    --overlap 64 \
+    --workers 8
+# Step 3: Cell detection & classification
+python inference.py \
+    --input_dir /path/to/patch_folders \
+    --output_dir /path/to/results \
+    --model_path ./HNE2cell_all_patch73_jit.pt \
+    --magnification 40 \
+    --batch_size 32
+```
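The three commands can also be chained from Python so that a failure in one step stops the run. The following is a minimal sketch: the script names and flags are the ones shown in the shell block, while `build_commands` and `run_pipeline` are hypothetical helpers and the argument names are chosen for illustration.

```python
import subprocess
import sys


def build_commands(slides_dir, target, patch_dir, results_dir, model_path,
                   magnification=40, patch_size=256, overlap=64,
                   workers=8, batch_size=32):
    """Build argument lists for the three pipeline steps."""
    py = sys.executable  # run the scripts with the active interpreter
    return [
        # Step 1: color normalization (Reinhard method)
        [py, "normalize.py", "--input_dir", slides_dir, "--target", target],
        # Step 2: patch extraction
        [py, "patchify.py", "--input_dir", slides_dir,
         "--magnification", str(magnification),
         "--patch_size", str(patch_size),
         "--overlap", str(overlap),
         "--workers", str(workers)],
        # Step 3: cell detection and classification
        [py, "inference.py", "--input_dir", patch_dir,
         "--output_dir", results_dir,
         "--model_path", model_path,
         "--magnification", str(magnification),
         "--batch_size", str(batch_size)],
    ]


def run_pipeline(commands):
    """Run each step in order; check=True raises on a non-zero exit code."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline(build_commands(...))` from the repository root would then execute the steps in sequence. Passing arguments as lists avoids shell quoting problems with paths that contain spaces.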
+---
 ## Input / Output Details
 ### Input
 ```bibtex
 @misc{hne2cell,
   title={Spatial transcriptomics–supervised deep learning enables single-cell mapping of tumor immune architecture from routine histology},
+  year={2026},
   url={https://huggingface.co/roobee79/HNE2Cell}
 }
 ```
+The example slide is derived from data generated by The Cancer Genome Atlas (TCGA):
+<https://portal.gdc.cancer.gov/>.