# Deep Tissue Detector - DenseNet121

A deep learning model for detecting tissue regions in Whole Slide Images (WSI) of histopathology slides.

## Model Description

This is a DenseNet121-based tissue detector trained to classify image patches into three categories for histopathology analysis. The model is specifically designed to identify tissue-containing regions in H&E (Hematoxylin and Eosin) stained whole slide images, enabling efficient processing of large pathology datasets.

### Model Details

- **Architecture**: DenseNet121
- **Input Size**: 224x224 RGB images
- **Number of Classes**: 3
  - Class 0: Background/Non-tissue
  - Class 1: Artifact/Low-quality
  - Class 2: Tissue (high quality)
- **Parameters**: ~7M
- **Model Size**: 28MB
- **Framework**: PyTorch
- **License**: GPL-3.0

### Intended Use

This model is designed for:
- Filtering tissue-containing patches from whole slide images
- Quality control in computational pathology pipelines
- Preprocessing for downstream cancer analysis tasks
- Reducing computational burden by identifying regions of interest

**Recommended Usage**: Accept patches where class 2 (tissue) probability ≥ 0.8

## Usage

### Installation

```bash
pip install torch torchvision huggingface_hub pillow
```

### Basic Usage

```python
import torch
from PIL import Image
from tissue_detector import TissueDetector

# Initialize detector (automatically downloads weights from HuggingFace)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
detector = TissueDetector(device=device)
detector.model.eval()  # inference mode; harmless if the detector already set it

# Load the patch as 3-channel RGB (PNG files may carry an alpha channel)
image = Image.open('patch.png').convert('RGB')
patch_transformed = detector.transforms(image)
patch_batch = patch_transformed.unsqueeze(0).to(detector.device)

# Get class probabilities
with torch.no_grad():
    prediction = detector.model(patch_batch)
    probabilities = torch.nn.functional.softmax(prediction, dim=1).cpu().numpy()[0]
    tissue_class = probabilities.argmax()

# Check if patch contains tissue (class 2 with threshold 0.8)
is_tissue = bool(tissue_class == 2 and probabilities[2] >= 0.8)
print(f"Tissue detected: {is_tissue} (confidence: {probabilities[2]:.3f})")
```
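
The acceptance rule above can be pulled out into a small helper and applied to many patches at once. The following is a minimal, dependency-free sketch, not part of the `tissue_detector` package: the function names `softmax`, `is_tissue`, and `filter_tissue` are hypothetical, and the logits are made-up values chosen only to illustrate the rule.

```python
import math

TISSUE_CLASS = 2  # class index for tissue, as listed under Model Details


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_tissue(logits, threshold=0.8):
    """Return True when the tissue probability meets the threshold."""
    return softmax(logits)[TISSUE_CLASS] >= threshold


def filter_tissue(batch_logits, threshold=0.8):
    """Return the indices of patches accepted as tissue."""
    return [i for i, logits in enumerate(batch_logits) if is_tissue(logits, threshold)]


# Made-up logits for three patches, for illustration only.
example_logits = [
    [4.0, 0.5, 0.2],  # background dominates
    [0.1, 0.3, 5.0],  # tissue dominates
    [0.0, 1.0, 1.5],  # tissue is the top class, but its probability is below 0.8
]
accepted = filter_tissue(example_logits)
print(accepted)
```

Because the three probabilities sum to 1, any threshold above 0.5 already implies that class 2 is the top class, so the helper checks only the probability and not the argmax.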

### Integration with HoneyBee Framework

```python
from honeybee.models import TissueDetector
from honeybee.processors.wsi import SimpleSlide

# Initialize tissue detector
detector = TissueDetector(device='cuda')

# Load whole slide image
slide = SimpleSlide(
    slide_path='path/to/slide.svs',
    tile_size=512,
    max_patches=100,
    tissue_detector=detector
)

# Extract tissue patches (automatically filtered)
patches = slide.load_patches_concurrently(target_patch_size=224)
```

### Manual Download

```python
from huggingface_hub import hf_hub_download

# Download PyTorch weights (state dict)
state_dict_path = hf_hub_download(
    repo_id="Lab-Rasool/tissue-detector",
    filename="deep-tissue-detector_densenet_state-dict.pt"
)

# Or download SafeTensors format (recommended)
safetensors_path = hf_hub_download(
    repo_id="Lab-Rasool/tissue-detector",
    filename="model.safetensors"
)
```

## Model Performance

The model has been extensively used in cancer classification, survival analysis, and multimodal integration tasks within the HoneyBee framework. It effectively filters:

- **Background regions**: Glass slide backgrounds, whitespace
- **Artifacts**: Pen marks, dust, blur, fold artifacts
- **Tissue regions**: High-quality H&E stained tissue for analysis

**Recommended threshold**: 0.8 probability for class 2 provides a good balance between recall and precision for tissue detection.

## Training Data

The model was originally trained as part of the [SliDL](https://github.com/markowetzlab/slidl) project for tissue detection in histopathology whole slide images. The training data consisted of H&E stained tissue sections annotated for tissue and non-tissue regions.

## Preprocessing

The model expects images preprocessed with standard ImageNet normalization:

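```python
from torchvision import transforms

preprocessing = transforms.Compose([
    # Resize(224) scales the shorter side to 224 px and keeps the aspect ratio,
    # so square patches come out as 224x224
    transforms.Resize(224),
    transforms.ToTensor(),  # PIL image -> float tensor with values in [0, 1]
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])
```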
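
To make the normalization arithmetic concrete, the sketch below applies the same two steps (scaling to [0, 1], then subtracting the mean and dividing by the standard deviation per channel) to a single pixel in plain Python. The helper name `normalize_pixel` is hypothetical and exists only for this illustration; the constants are the ones listed in the pipeline.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map an 8-bit RGB pixel to the per-channel values the network receives."""
    scaled = [c / 255.0 for c in rgb]  # same scaling as ToTensor: [0, 255] -> [0.0, 1.0]
    return [(s - m) / d for s, m, d in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]


white = normalize_pixel((255, 255, 255))  # e.g. bare glass background
black = normalize_pixel((0, 0, 0))
print([round(v, 3) for v in white])  # [2.249, 2.429, 2.64]
print([round(v, 3) for v in black])  # [-2.118, -2.036, -1.804]
```

The model therefore sees inputs roughly in the range -2.1 to 2.6 per channel, which is why feeding raw 0-255 or unnormalized 0-1 tensors is likely to give unreliable predictions.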

## Limitations and Biases

- **Stain Variation**: Performance may vary with different staining protocols or scanners
- **Tissue Types**: Primarily validated on H&E stained tissue; may not generalize to other stains
- **Resolution**: Designed for patches at standard WSI resolution (~0.5 µm/pixel)
- **Artifacts**: May misclassify unusual artifacts not seen during training
- **Medical Use**: This model is for research purposes only and not intended for clinical diagnosis
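
For the resolution point, one way to stay close to ~0.5 µm/pixel is to read a larger or smaller region from the slide and resize it to the model's input size. The sketch below shows only the arithmetic; `level0_read_size` is a hypothetical helper, and it assumes you have obtained the slide's microns-per-pixel value from the scanner metadata yourself.

```python
def level0_read_size(patch_px, target_mpp, slide_mpp):
    """Side length, in level-0 pixels, covering `patch_px` pixels at `target_mpp`.

    patch_px   -- patch side length the model expects (e.g. 224)
    target_mpp -- resolution the model was designed for, in µm/pixel
    slide_mpp  -- native resolution of the slide at level 0, in µm/pixel
    """
    if target_mpp <= 0 or slide_mpp <= 0:
        raise ValueError("microns-per-pixel values must be positive")
    return round(patch_px * target_mpp / slide_mpp)


# A slide scanned at 0.25 µm/pixel: read 448 px and downsample to 224 px.
print(level0_read_size(224, 0.5, 0.25))  # 448
# A slide scanned at 0.5 µm/pixel already matches: read 224 px directly.
print(level0_read_size(224, 0.5, 0.5))   # 224
```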

## Applications in Research

This tissue detector has been used in:
- **Cancer Classification**: TCGA multi-cancer type classification (BRCA, BLCA, KIRC, LIHC, etc.)
- **Survival Analysis**: Cox proportional hazards and deep survival models
- **Stain Normalization Studies**: Evaluating impact of Macenko/Reinhard normalization
- **Multimodal Integration**: Combining pathology with clinical, radiology, and molecular data
- **Foundation Model Evaluation**: Preprocessing for UNI, UNI2, Virchow2 embeddings

## Citation

If you use this model, please cite the HoneyBee framework and the original SliDL project:

```bibtex
@software{honeybee2024,
  title={HoneyBee: A Scalable Modular Framework for Multimodal AI in Oncology},
  author={Lab-Rasool},
  year={2024},
  url={https://github.com/lab-rasool/HoneyBee}
}

@software{slidl,
  title={SliDL: A Python library for deep learning on whole-slide images},
  author={Markowetz Lab},
  url={https://github.com/markowetzlab/slidl}
}
```

## Model Card Authors

Lab-Rasool

## Model Card Contact

For questions or issues, please open an issue on the [HoneyBee GitHub repository](https://github.com/lab-rasool/HoneyBee).

## Additional Resources

- **HoneyBee Framework**: [https://github.com/lab-rasool/HoneyBee](https://github.com/lab-rasool/HoneyBee)
- **Documentation**: [https://lab-rasool.github.io/HoneyBee](https://lab-rasool.github.io/HoneyBee)
- **Original Model Source**: [SliDL GitHub](https://github.com/markowetzlab/slidl)

## Version History

- **v1.0** (2024): Initial release with DenseNet121 architecture
  - PyTorch state dict format
  - SafeTensors format for improved loading
  - Integration with HuggingFace Hub