Commit c368846 by roobee79 · verified · 1 Parent(s): dc44fbd

Update README.md

Files changed (1)
  1. README.md +124 -42
README.md CHANGED
@@ -33,10 +33,10 @@ It takes 256×256 px patches as input and outputs per-cell contours, centroids,
33
  The full pipeline consists of three steps:
34
 
35
  ```
36
- ┌───────────────┐     ┌────────────────┐     ┌───────────────────┐
37
- │  1. Normalize │ ──→ │  2. Patchify   │ ──→ │  3. Inference     │
38
- │  (Reinhard)   │     │ (256px, 64ov)  │     │   (Cell Detection)│
39
- └───────────────┘     └────────────────┘     └───────────────────┘
40
  SVS / TIF PNG patches Masks + Centroids
41
  ```
42
 
@@ -52,17 +52,85 @@ The full pipeline consists of three steps:
52
 
53
  ---
54
 
55
- ## Quick Start
56
 
57
- ### 1. Install dependencies
58
 
59
  ```bash
60
- pip install torch torchvision tifffile scikit-image pillow opencv-python-headless pandas tqdm
61
- # Optional (for .svs files):
62
- pip install openslide-python
63
  ```
64
 
65
- ### 2. Download the model
66
 
67
  ```python
68
  from huggingface_hub import hf_hub_download
@@ -73,37 +141,12 @@ model_path = hf_hub_download(
73
  )
74
  ```
75
 
76
- ### 3. Run the full pipeline
77
-
78
- ```bash
79
- # Step 1: Color normalization (Reinhard method)
80
- python normalize.py \
81
- --input_dir /path/to/slides \
82
- --target /path/to/standard-ilc.tif
83
-
84
- # Step 2: Extract patches (40x recommended)
85
- python patchify.py \
86
- --input_dir /path/to/slides \
87
- --magnification 40 \
88
- --patch_size 256 \
89
- --overlap 64 \
90
- --workers 8
91
-
92
- # Step 3: Cell detection & classification
93
- python inference.py \
94
- --input_dir /path/to/patch_folders \
95
- --output_dir /path/to/results \
96
- --model_path ./HNE2cell_all_patch73_jit.pt \
97
- --magnification 40 \
98
- --batch_size 32
99
- ```
100
-
101
  ---
102
 
103
- ## Example: Reproducible Walkthrough
104
 
105
- To verify your installation, run the pipeline on the example slide included in this repository
106
- (`TCGA-56-8628-01Z-00-DX1`, LUSC, ~36 MB).
107
 
108
  ### Download the model, example slide, and reference image
109
 
@@ -167,13 +210,52 @@ example/results/
167
  └── patch_*_centroid.csv # Cell centroids with type labels
168
  ```
169
 
170
- Approximate runtime on a single NVIDIA A100: **~20 min** for the full slide
171
 
172
  > The example slide is from **TCGA-LUSC** and is redistributed under the
173
  > [NIH Genomic Data Sharing Policy](https://sharing.nih.gov/genomic-data-sharing-policy).
174
 
175
  ---
176
 
177
  ## Input / Output Details
178
 
179
  ### Input
@@ -245,10 +327,10 @@ If you use HNE2Cell in your research, please cite:
245
  ```bibtex
246
  @misc{hne2cell,
247
  title={Spatial transcriptomics–supervised deep learning enables single-cell mapping of tumor immune architecture from routine histology},
248
- year={2025},
249
  url={https://huggingface.co/roobee79/HNE2Cell}
250
  }
251
  ```
252
 
253
- The example slide is derived from data generated by the TCGA Research Network:
254
- <https://www.cancer.gov/tcga>.
 
33
  The full pipeline consists of three steps:
34
 
35
  ```
36
+ ┌───────────────┐     ┌────────────────┐     ┌──────────────────┐
37
+ │  1. Normalize │ ──→ │  2. Patchify   │ ──→ │  3. Inference    │
38
+ │  (Reinhard)   │     │ (256px, 64ov)  │     │ (Cell Detection) │
39
+ └───────────────┘     └────────────────┘     └──────────────────┘
40
  SVS / TIF PNG patches Masks + Centroids
41
  ```
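Reviewer sketch (not part of the commit): step 1 of the diagram names the Reinhard method, which shifts and rescales each color channel so that its mean and standard deviation match those of a reference image. The snippet below illustrates only that per-channel statistic matching on a flat list of values. The function name `match_channel` is hypothetical, the color-space conversion that normally precedes this step is omitted, and nothing here is taken from `normalize.py`, whose implementation is not shown in this diff.

```python
from statistics import fmean, pstdev


def match_channel(source, target_mean, target_std):
    """Rescale one channel so its mean/std match the target statistics.

    Illustrative only: a real pipeline would apply this to every channel of
    the image after converting it to a decorrelated color space.
    """
    src_mean = fmean(source)
    src_std = pstdev(source)
    if src_std == 0:
        # A constant channel carries no contrast; map it to the target mean.
        return [target_mean for _ in source]
    scale = target_std / src_std
    return [(value - src_mean) * scale + target_mean for value in source]


if __name__ == "__main__":
    normalized = match_channel([1.0, 2.0, 3.0], target_mean=10.0, target_std=2.0)
    print(normalized)
```

In practice the target statistics would be computed once from the reference image (the `--target` file in the README) and reused for every slide.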
42
 
 
52
 
53
  ---
54
 
55
+ ## System Requirements
56
 
57
+ ### Software dependencies (tested versions)
58
+
59
+ Core packages (as reported in the manuscript):
60
+
61
+ - Python 3.10
62
+ - pytorch == 2.5.1
63
+ - timm == 1.0.8
64
+ - transformers == 4.44.0
65
+ - scanpy == 1.10.3
66
+ - squidpy == 1.5.0
67
+ - spatialdata == 0.2.5
68
+ - scikit-image == 0.24.0
69
+ - scikit-learn == 1.2.2
70
+ - scipy == 1.13.1
71
+ - shapely == 2.0.7
72
+
73
+ Additional utilities required by the pipeline scripts:
74
+
75
+ - torchvision (matching the PyTorch 2.5.1 release)
76
+ - tifffile, Pillow, opencv-python-headless, pandas, tqdm
77
+ - huggingface_hub
78
+ - openslide-python (optional, for `.svs` files)
79
+
80
+ ### Operating systems tested
81
+
82
+ - Ubuntu 22.04 LTS
83
+ - Ubuntu 20.04 LTS
84
+
85
+ (Not tested on Windows/macOS.)
86
+
87
+ ### Hardware requirements
88
+
89
+ > **Note:** WSI processing is memory-intensive. This pipeline is designed for
90
+ > server- or workstation-class hardware, not standard desktops.
91
+
92
+ **Minimum (small WSIs, ~1–2 GB):**
93
+ - GPU: NVIDIA GPU with ≥12 GB VRAM
94
+ - RAM: 32 GB (64 GB strongly recommended)
95
+ - Disk: 100 GB free
96
+
97
+ **Recommended (typical WSIs, 2–10 GB):**
98
+ - GPU: NVIDIA A100 / RTX 4090 / RTX 3090 (≥24 GB VRAM)
99
+ - RAM: ≥128 GB
100
+ - Disk: 500 GB+ free (intermediate `Aligned-hne.tif` can be 20–50 GB per slide)
101
+
102
+ **Tested configurations:**
103
+ - NVIDIA A100 (40 GB VRAM), 256 GB RAM, Ubuntu 22.04
104
+ - NVIDIA RTX 3060 (12 GB VRAM), 64 GB RAM, Ubuntu 22.04
105
+
106
+ CPU-only inference is not supported in practice - full WSI inference would take
107
+ days even on a high-core-count CPU.
108
+
109
+ ---
110
+
111
+ ## Installation Guide
112
+
113
+ ### Recommended: Conda environment from `cellvit_rv3.yml`
114
+
115
+ The repository includes a frozen conda environment file with all dependencies pinned
116
+ to the exact versions used in the manuscript.
117
 
118
  ```bash
119
+ # 1. Download environment file
120
+ wget https://huggingface.co/roobee79/HNE2Cell/resolve/main/cellvit_rv3.yml
121
+
122
+ # 2. Create environment
123
+ conda env create -f cellvit_rv3.yml
124
+
125
+ # 3. Activate
126
+ conda activate cellvit_rv3
127
  ```
128
 
129
+ **Typical install time:** ~10–15 minutes on a Linux server with a stable network connection
130
+ (dominated by the PyTorch + CUDA toolkit download).
131
+
132
+
133
+ ### Download the model
134
 
135
  ```python
136
  from huggingface_hub import hf_hub_download
 
141
  )
142
  ```
143
 
144
  ---
145
 
146
+ ## Demo: Reproducible Walkthrough
147
 
148
+ To verify your installation, run the pipeline on the example slide included in this
149
+ repository (`TCGA-56-8628-01Z-00-DX1`, LUSC, ~36 MB).
150
 
151
  ### Download the model, example slide, and reference image
152
 
 
210
  └── patch_*_centroid.csv # Cell centroids with type labels
211
  ```
212
 
213
+ **Expected results on the example slide (`TCGA-56-8628-01Z-00-DX1`):**
214
+ Approximately **63,000 cells** are detected across the 16 classes.
215
+ Small variation (±a few percent) is expected between hardware configurations.
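Reviewer sketch (not part of the commit): one way to compare a local run against the expected cell count is to total the rows of the `patch_*_centroid.csv` files in the results folder. The helper name `count_cells` is hypothetical, and the CSV layout is an assumption (one header row, then one row per detected cell); the diff does not show the actual column names.

```python
import csv
from pathlib import Path


def count_cells(results_dir, pattern="patch_*_centroid.csv"):
    """Sum the data rows of every centroid CSV found under results_dir."""
    total = 0
    for csv_path in sorted(Path(results_dir).rglob(pattern)):
        with csv_path.open(newline="") as handle:
            rows = list(csv.reader(handle))
        # Assumption: the first row is a header and every later non-empty
        # row describes exactly one cell.
        total += sum(1 for row in rows[1:] if row)
    return total


if __name__ == "__main__":
    # "example/results" is the output folder used in the README walkthrough.
    print(count_cells("example/results"))
```

If the files turn out to have no header row, drop the `rows[1:]` slice; otherwise each file is undercounted by one cell.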
216
+
217
+ ### Expected runtime
218
+
219
+ | Hardware | Full pipeline runtime |
220
+ |---|---|
221
+ | NVIDIA A100 (40 GB) + 256 GB RAM | ~20 min |
222
+ | NVIDIA RTX 3060 (12 GB) + 64 GB RAM | ~30 min |
223
+
224
+ A system without sufficient RAM (<32 GB) will fail at the normalization step
225
+ due to full-resolution image loading.
226
 
227
  > The example slide is from **TCGA-LUSC** and is redistributed under the
228
  > [NIH Genomic Data Sharing Policy](https://sharing.nih.gov/genomic-data-sharing-policy).
229
 
230
  ---
231
 
232
+ ## Instructions for Use (On Your Own Data)
233
+
234
+ ```bash
235
+ # Step 1: Color normalization (Reinhard method)
236
+ python normalize.py \
237
+ --input_dir /path/to/slides \
238
+ --target /path/to/standard-ilc.tif
239
+
240
+ # Step 2: Extract patches (40x recommended)
241
+ python patchify.py \
242
+ --input_dir /path/to/slides \
243
+ --magnification 40 \
244
+ --patch_size 256 \
245
+ --overlap 64 \
246
+ --workers 8
247
+
248
+ # Step 3: Cell detection & classification
249
+ python inference.py \
250
+ --input_dir /path/to/patch_folders \
251
+ --output_dir /path/to/results \
252
+ --model_path ./HNE2cell_all_patch73_jit.pt \
253
+ --magnification 40 \
254
+ --batch_size 32
255
+ ```
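Reviewer sketch (not part of the commit): the `--patch_size 256 --overlap 64` settings suggest a sliding window with a stride of 256 - 64 = 192 px. The snippet below enumerates patch origins under that reading. Both function names are hypothetical, and the edge handling (adding one final patch flush with the border) is an assumption; `patchify.py` may treat slide borders differently.

```python
def patch_origins(length, patch_size=256, overlap=64):
    """Top-left offsets along one axis for overlapping patches."""
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must satisfy 0 <= overlap < patch_size")
    stride = patch_size - overlap  # 192 px with the README settings
    if length <= patch_size:
        return [0]
    origins = list(range(0, length - patch_size + 1, stride))
    # Assumption: add a last patch flush with the edge so no pixels are skipped.
    if origins[-1] + patch_size < length:
        origins.append(length - patch_size)
    return origins


def patch_grid(width, height, patch_size=256, overlap=64):
    """All (x, y) patch origins for an image, in row-major order."""
    xs = patch_origins(width, patch_size, overlap)
    ys = patch_origins(height, patch_size, overlap)
    return [(x, y) for y in ys for x in xs]


if __name__ == "__main__":
    print(patch_origins(700))  # [0, 192, 384, 444]
```

With these settings neighbouring patches share a 64 px band, so a cell lying on a patch border appears whole in at least one patch, provided it is smaller than the overlap.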
256
+
257
+ ---
258
+
259
  ## Input / Output Details
260
 
261
  ### Input
 
327
  ```bibtex
328
  @misc{hne2cell,
329
  title={Spatial transcriptomics–supervised deep learning enables single-cell mapping of tumor immune architecture from routine histology},
330
+ year={2026},
331
  url={https://huggingface.co/roobee79/HNE2Cell}
332
  }
333
  ```
334
 
335
+ The example slide is derived from data generated by the TCGA:
336
+ <https://portal.gdc.cancer.gov/>.