Update README.md
README.md
CHANGED
|
@@ -33,10 +33,10 @@ It takes 256×256 px patches as input and outputs per-cell contours, centroids,
|
|
| 33 |
The full pipeline consists of three steps:
|
| 34 |
|
| 35 |
```
|
| 36 |
-
┌───────────────┐
|
| 37 |
-
│ 1. Normalize  │ ──▶ │ 2. Patchify    │ ──▶ │ 3. Inference
|
| 38 |
-
│ (Reinhard)    │     │ (256px, 64ov)  │     │ (Cell Detection)│
|
| 39 |
-
└───────────────┘
|
| 40 |
    SVS / TIF            PNG patches          Masks + Centroids
|
| 41 |
```
|
| 42 |
|
|
@@ -52,17 +52,85 @@ The full pipeline consists of three steps:
|
|
| 52 |
|
| 53 |
---
|
| 54 |
|
| 55 |
-
##
|
| 56 |
|
| 57 |
-
###
|
| 58 |
|
| 59 |
```bash
|
| 60 |
-
|
| 61 |
-
|
| 62 |
-
|
| 63 |
```
|
| 64 |
|
| 65 |
-
|
| 66 |
|
| 67 |
```python
|
| 68 |
from huggingface_hub import hf_hub_download
|
|
@@ -73,37 +141,12 @@ model_path = hf_hub_download(
|
|
| 73 |
)
|
| 74 |
```
|
| 75 |
|
| 76 |
-
### 3. Run the full pipeline
|
| 77 |
-
|
| 78 |
-
```bash
|
| 79 |
-
# Step 1: Color normalization (Reinhard method)
|
| 80 |
-
python normalize.py \
|
| 81 |
-
--input_dir /path/to/slides \
|
| 82 |
-
--target /path/to/standard-ilc.tif
|
| 83 |
-
|
| 84 |
-
# Step 2: Extract patches (40x recommended)
|
| 85 |
-
python patchify.py \
|
| 86 |
-
--input_dir /path/to/slides \
|
| 87 |
-
--magnification 40 \
|
| 88 |
-
--patch_size 256 \
|
| 89 |
-
--overlap 64 \
|
| 90 |
-
--workers 8
|
| 91 |
-
|
| 92 |
-
# Step 3: Cell detection & classification
|
| 93 |
-
python inference.py \
|
| 94 |
-
--input_dir /path/to/patch_folders \
|
| 95 |
-
--output_dir /path/to/results \
|
| 96 |
-
--model_path ./HNE2cell_all_patch73_jit.pt \
|
| 97 |
-
--magnification 40 \
|
| 98 |
-
--batch_size 32
|
| 99 |
-
```
|
| 100 |
-
|
| 101 |
---
|
| 102 |
|
| 103 |
-
##
|
| 104 |
|
| 105 |
-
To verify your installation, run the pipeline on the example slide included in this
|
| 106 |
-
(`TCGA-56-8628-01Z-00-DX1`, LUSC, ~36 MB).
|
| 107 |
|
| 108 |
### Download the model, example slide, and reference image
|
| 109 |
|
|
@@ -167,13 +210,52 @@ example/results/
|
|
| 167 |
└── patch_*_centroid.csv   # Cell centroids with type labels
|
| 168 |
```
|
| 169 |
|
| 170 |
-
|
| 171 |
|
| 172 |
> The example slide is from **TCGA-LUSC** and is redistributed under the
|
| 173 |
> [NIH Genomic Data Sharing Policy](https://sharing.nih.gov/genomic-data-sharing-policy).
|
| 174 |
|
| 175 |
---
|
| 176 |
|
| 177 |
## Input / Output Details
|
| 178 |
|
| 179 |
### Input
|
|
@@ -245,10 +327,10 @@ If you use HNE2Cell in your research, please cite:
|
|
| 245 |
```bibtex
|
| 246 |
@misc{hne2cell,
|
| 247 |
title={Spatial transcriptomics-supervised deep learning enables single-cell mapping of tumor immune architecture from routine histology},
|
| 248 |
-
year={
|
| 249 |
url={https://huggingface.co/roobee79/HNE2Cell}
|
| 250 |
}
|
| 251 |
```
|
| 252 |
|
| 253 |
-
The example slide is derived from data generated by the TCGA
|
| 254 |
-
<https://
|
|
|
|
| 33 |
The full pipeline consists of three steps:
|
| 34 |
|
| 35 |
```
|
| 36 |
+
┌───────────────┐     ┌────────────────┐     ┌──────────────────┐
|
| 37 |
+
│ 1. Normalize  │ ──▶ │ 2. Patchify    │ ──▶ │ 3. Inference     │
|
| 38 |
+
│ (Reinhard)    │     │ (256px, 64ov)  │     │ (Cell Detection) │
|
| 39 |
+
└───────────────┘     └────────────────┘     └──────────────────┘
|
| 40 |
    SVS / TIF            PNG patches          Masks + Centroids
|
| 41 |
```
|
| 42 |
|
|
|
|
| 52 |
|
| 53 |
---
|
| 54 |
|
| 55 |
+
## System Requirements
|
| 56 |
|
| 57 |
+
### Software dependencies (tested versions)
|
| 58 |
+
|
| 59 |
+
Core packages (as reported in the manuscript):
|
| 60 |
+
|
| 61 |
+
- Python 3.10
|
| 62 |
+
- pytorch == 2.5.1
|
| 63 |
+
- timm == 1.0.8
|
| 64 |
+
- transformers == 4.44.0
|
| 65 |
+
- scanpy == 1.10.3
|
| 66 |
+
- squidpy == 1.5.0
|
| 67 |
+
- spatialdata == 0.2.5
|
| 68 |
+
- scikit-image == 0.24.0
|
| 69 |
+
- scikit-learn == 1.2.2
|
| 70 |
+
- scipy == 1.13.1
|
| 71 |
+
- shapely == 2.0.7
|
| 72 |
+
|
| 73 |
+
Additional utilities required by the pipeline scripts:
|
| 74 |
+
|
| 75 |
+
- torchvision (matching the PyTorch 2.5.1 release)
|
| 76 |
+
- tifffile, Pillow, opencv-python-headless, pandas, tqdm
|
| 77 |
+
- huggingface_hub
|
| 78 |
+
- openslide-python (optional, for `.svs` files)
|
| 79 |
+
|
| 80 |
+
### Operating systems tested
|
| 81 |
+
|
| 82 |
+
- Ubuntu 22.04 LTS
|
| 83 |
+
- Ubuntu 20.04 LTS
|
| 84 |
+
|
| 85 |
+
(Not tested on Windows/macOS.)
|
| 86 |
+
|
| 87 |
+
### Hardware requirements
|
| 88 |
+
|
| 89 |
+
> **Note:** WSI processing is memory-intensive. This pipeline is designed for
|
| 90 |
+
> server- or workstation-class hardware, not standard desktops.
|
| 91 |
+
|
| 92 |
+
**Minimum (small WSIs, ~1–2 GB):**
|
| 93 |
+
- GPU: NVIDIA GPU with ≥12 GB VRAM
|
| 94 |
+
- RAM: 32 GB (64 GB strongly recommended)
|
| 95 |
+
- Disk: 100 GB free
|
| 96 |
+
|
| 97 |
+
**Recommended (typical WSIs, 2–10 GB):**
|
| 98 |
+
- GPU: NVIDIA A100 / RTX 4090 / RTX 3090 (≥24 GB VRAM)
|
| 99 |
+
- RAM: ≥128 GB
|
| 100 |
+
- Disk: 500 GB+ free (intermediate `Aligned-hne.tif` can be 20–50 GB per slide)
|
| 101 |
+
|
| 102 |
+
**Tested configurations:**
|
| 103 |
+
- NVIDIA A100 (40 GB VRAM), 256 GB RAM, Ubuntu 22.04
|
| 104 |
+
- NVIDIA RTX 3060 (12 GB VRAM), 64 GB RAM, Ubuntu 22.04
|
| 105 |
+
|
| 106 |
+
CPU-only inference is not supported in practice: full WSI inference would take
|
| 107 |
+
days even on a high-core-count CPU.
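
The minimum RAM and disk figures above can be checked before starting a run. The following is a minimal sketch, not part of the repository: the helper names (`total_ram_gb`, `free_disk_gb`, `preflight_problems`) and the 1 GB RAM margin are assumptions made for illustration, `total_ram_gb` relies on Linux `sysconf` values, and GPU VRAM is not checked.

```python
import os
import shutil

MIN_RAM_GB = 32    # "Minimum" RAM from the hardware requirements
MIN_DISK_GB = 100  # "Minimum" free disk space from the hardware requirements
RAM_MARGIN_GB = 1  # assumption: the OS reports slightly less than nominal RAM


def total_ram_gb():
    """Total physical RAM in GiB (Linux only, via sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3


def free_disk_gb(path="."):
    """Free space in GiB on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def preflight_problems(ram_gb, disk_gb):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if ram_gb < MIN_RAM_GB - RAM_MARGIN_GB:
        problems.append(f"RAM {ram_gb:.1f} GB is below the {MIN_RAM_GB} GB minimum")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"free disk {disk_gb:.1f} GB is below the {MIN_DISK_GB} GB minimum")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems(total_ram_gb(), free_disk_gb(".")):
        print("WARNING:", problem)
```

Passing the measured values into `preflight_problems` as arguments keeps the threshold logic separate from the Linux-specific probing, so the same check can be reused with numbers obtained another way.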
|
| 108 |
+
|
| 109 |
+
---
|
| 110 |
+
|
| 111 |
+
## Installation Guide
|
| 112 |
+
|
| 113 |
+
### Recommended: Conda environment from `cellvit_rv3.yml`
|
| 114 |
+
|
| 115 |
+
The repository includes a frozen conda environment file with all dependencies pinned
|
| 116 |
+
to the exact versions used in the manuscript.
|
| 117 |
|
| 118 |
```bash
|
| 119 |
+
# 1. Download environment file
|
| 120 |
+
wget https://huggingface.co/roobee79/HNE2Cell/resolve/main/cellvit_rv3.yml
|
| 121 |
+
|
| 122 |
+
# 2. Create environment
|
| 123 |
+
conda env create -f cellvit_rv3.yml
|
| 124 |
+
|
| 125 |
+
# 3. Activate
|
| 126 |
+
conda activate cellvit_rv3
|
| 127 |
```
|
| 128 |
|
| 129 |
+
**Typical install time:** ~10–15 minutes on a Linux server with a stable network connection
|
| 130 |
+
(dominated by the PyTorch + CUDA toolkit download).
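
After activating the environment, a short script can confirm that the core packages match the tested versions listed under System Requirements. This is a minimal sketch rather than a script shipped with the repository: `installed_version` and `find_mismatches` are hypothetical helper names, and it assumes PyTorch is installed under the distribution name `torch`.

```python
from importlib import metadata

# Versions listed under "Software dependencies (tested versions)".
# Assumption: PyTorch is installed under the distribution name "torch".
TESTED_VERSIONS = {
    "torch": "2.5.1",
    "timm": "1.0.8",
    "transformers": "4.44.0",
    "scanpy": "1.10.3",
    "squidpy": "1.5.0",
    "spatialdata": "0.2.5",
    "scikit-image": "0.24.0",
    "scikit-learn": "1.2.2",
    "scipy": "1.13.1",
    "shapely": "2.0.7",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu121" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(TESTED_VERSIONS).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

If the script prints nothing, every listed package is present at the tested version.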
|
| 131 |
+
|
| 132 |
+
|
| 133 |
+
### Download the model
|
| 134 |
|
| 135 |
```python
|
| 136 |
from huggingface_hub import hf_hub_download
|
|
|
|
| 141 |
)
|
| 142 |
```
|
| 143 |
|
| 144 |
---
|
| 145 |
|
| 146 |
+
## Demo: Reproducible Walkthrough
|
| 147 |
|
| 148 |
+
To verify your installation, run the pipeline on the example slide included in this
|
| 149 |
+
repository (`TCGA-56-8628-01Z-00-DX1`, LUSC, ~36 MB).
|
| 150 |
|
| 151 |
### Download the model, example slide, and reference image
|
| 152 |
|
|
|
|
| 210 |
└── patch_*_centroid.csv   # Cell centroids with type labels
|
| 211 |
```
|
| 212 |
|
| 213 |
+
**Expected results on the example slide (`TCGA-56-8628-01Z-00-DX1`):**
|
| 214 |
+
Approximately **63,000 cells** are detected across the 16 classes.
|
| 215 |
+
Small variation (± a few percent) is expected between hardware configurations.
|
| 216 |
+
|
| 217 |
+
### Expected runtime
|
| 218 |
+
|
| 219 |
+
| Hardware | Full pipeline runtime |
|
| 220 |
+
|---|---|
|
| 221 |
+
| NVIDIA A100 (40 GB) + 256 GB RAM | ~20 min |
|
| 222 |
+
| NVIDIA RTX 3060 (12 GB) + 64 GB RAM | ~30 min |
|
| 223 |
+
|
| 224 |
+
A system without sufficient RAM (<32 GB) will fail at the normalization step
|
| 225 |
+
due to full-resolution image loading.
|
| 226 |
|
| 227 |
> The example slide is from **TCGA-LUSC** and is redistributed under the
|
| 228 |
> [NIH Genomic Data Sharing Policy](https://sharing.nih.gov/genomic-data-sharing-policy).
|
| 229 |
|
| 230 |
---
|
| 231 |
|
| 232 |
+
## Instructions for Use (On Your Own Data)
|
| 233 |
+
|
| 234 |
+
```bash
|
| 235 |
+
# Step 1: Color normalization (Reinhard method)
|
| 236 |
+
python normalize.py \
|
| 237 |
+
--input_dir /path/to/slides \
|
| 238 |
+
--target /path/to/standard-ilc.tif
|
| 239 |
+
|
| 240 |
+
# Step 2: Extract patches (40x recommended)
|
| 241 |
+
python patchify.py \
|
| 242 |
+
--input_dir /path/to/slides \
|
| 243 |
+
--magnification 40 \
|
| 244 |
+
--patch_size 256 \
|
| 245 |
+
--overlap 64 \
|
| 246 |
+
--workers 8
|
| 247 |
+
|
| 248 |
+
# Step 3: Cell detection & classification
|
| 249 |
+
python inference.py \
|
| 250 |
+
--input_dir /path/to/patch_folders \
|
| 251 |
+
--output_dir /path/to/results \
|
| 252 |
+
--model_path ./HNE2cell_all_patch73_jit.pt \
|
| 253 |
+
--magnification 40 \
|
| 254 |
+
--batch_size 32
|
| 255 |
+
```
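
To make the `--patch_size 256 --overlap 64` settings in step 2 concrete, the sketch below computes where patches would start along each axis. It assumes that "overlap" means the number of pixels shared by adjacent patches (a stride of 192 px) and that only full patches are kept. How `patchify.py` actually handles image edges is not stated here, and `tile_origins` and `patch_grid` are hypothetical names.

```python
def tile_origins(length, patch_size=256, overlap=64):
    """Top-left offsets of full patches along one axis of `length` pixels."""
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must satisfy 0 <= overlap < patch_size")
    stride = patch_size - overlap  # 256 - 64 = 192 px with the defaults
    if length < patch_size:
        return []  # assumption: axes shorter than one patch yield no patches
    # Assumption: a trailing partial patch is dropped rather than padded.
    return list(range(0, length - patch_size + 1, stride))


def patch_grid(width, height, patch_size=256, overlap=64):
    """(x, y) origins of every full patch in a width x height image."""
    xs = tile_origins(width, patch_size, overlap)
    ys = tile_origins(height, patch_size, overlap)
    return [(x, y) for y in ys for x in xs]


if __name__ == "__main__":
    # A 640 x 448 px region gives 3 columns and 2 rows of patches.
    print(patch_grid(640, 448))
```

Under these assumptions each patch shares a 64 px band with its neighbour, so a cell near a patch border can appear in more than one patch.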
|
| 256 |
+
|
| 257 |
+
---
|
| 258 |
+
|
| 259 |
## Input / Output Details
|
| 260 |
|
| 261 |
### Input
|
|
|
|
| 327 |
```bibtex
|
| 328 |
@misc{hne2cell,
|
| 329 |
title={Spatial transcriptomics-supervised deep learning enables single-cell mapping of tumor immune architecture from routine histology},
|
| 330 |
+
year={2026},
|
| 331 |
url={https://huggingface.co/roobee79/HNE2Cell}
|
| 332 |
}
|
| 333 |
```
|
| 334 |
|
| 335 |
+
The example slide is derived from data generated by the TCGA:
|
| 336 |
+
<https://portal.gdc.cancer.gov/>.
|