---
datasets:
- arup-ri/kinyoun_afb_50k
language:
- en
tags:
- medical
---

This is an object detection model for finding acid-fast bacteria (AFB), trained on tiles pulled from Kinyoun-stained whole-slide images (WSIs).

## Model Details

### Model Description

- **Developed by:** Applied AI & Bioinformatics group within the Research & Innovation unit of ARUP Labs
- **Funded by:** ARUP Laboratories
- **Model type:** Object Detection
- **License:** [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)
- **Finetuned from model:** [ConvNeXt Base pretrained on ImageNet1k](https://docs.pytorch.org/vision/stable/models/generated/torchvision.models.convnext_base.html)

### Model Sources

- **Repository:** https://github.com/arup-ri/afb_detection
- **Paper:** in press, to be updated

## Uses

This model is intended to be used for research and academic purposes only.

### Direct Use

The model was trained for, and is intended for, detection of AFB on Kinyoun-stained WSIs. It can be applied to tiles from new Kinyoun-stained WSIs without fine-tuning, with the caveats mentioned below.

### Out-of-Scope Use

The model was trained exclusively on Kinyoun-stained WSIs and has not been tested on other AFB staining techniques such as Ziehl-Neelsen; it is unknown how well it might generalize to such data.

## Bias, Risks, and Limitations

Since the model's training data was limited to a single AFB clinical laboratory at a single institution, it is likely to suffer from some domain shift on data from other laboratories and WSI scanners.

### Recommendations

Fine-tuning on data from the target laboratory or scanner would likely help bridge this domain shift. For research use only!

## How to Get Started with the Model

Training and inference code for this model [is available here](https://github.com/arup-ri/afb_detection).

## Training Details

### Training Data

Training & testing datasets: https://huggingface.co/datasets/arup-ri/kinyoun_afb_50k

Training data consisted of approximately 50k 256x256 pixel tiles.
Approximately 20% of these tiles have bounding-box-annotated AFB, while the remaining tiles were taken from AFB-negative slides to increase the diversity of background debris and artifacts that the model should learn to recognize as _not_ AFB. More details can be found in the paper.

### Training Procedure

#### Preprocessing

Tiles were extracted from WSIs and resized to a consistent physical size (0.2878 microns per pixel). Standard image augmentations such as flips, rotations, weak Gaussian blurring, and brightness adjustments were randomly applied to images during training.

#### Training Hyperparameters

- **Training regime:** See yaml configs and source code for a complete listing and explanation of hyperparameters.

#### Speeds, Sizes, Times

On an L40S GPU, a training run to generate this model takes about 1 to 1.5 hours, depending heavily on the choice of early stopping and, to a lesser extent, on hyperparameters such as batch size.

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

Training & testing datasets: https://huggingface.co/datasets/arup-ri/kinyoun_afb_50k

#### Metrics

Object detection performance was evaluated with precision, recall, and F-score on a small held-out test set of tiles. Note that this test set was, like the training set, enriched for difficult tiles, as described in more detail in the paper.

WSI labels (AFB positive or negative) were predicted by computing the predicted AFB density, i.e., the number of predicted AFB object detections divided by the area of tiles seen by the model, and thresholding this density. See the paper for more details.

### Results

The best object detection F-score achieved on our test set was 0.53, though this would be substantially higher on a test set of randomly sampled tiles instead of a set enriched for difficult tiles.
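
As an illustration of the WSI-level labeling described under Metrics, the following is a minimal sketch of a density-and-threshold rule, not the repository's implementation. The names `TilePrediction`, `afb_density`, and `predict_wsi_label` are hypothetical, as are the per-detection `score_threshold`, the choice of square millimetres as the area unit, and any value of `density_threshold`; only the tile size and the microns-per-pixel resolution come from this card. See the paper and source code for the method actually used.

```python
from dataclasses import dataclass
from typing import List, Sequence

MICRONS_PER_PIXEL = 0.2878  # physical resolution stated under Preprocessing
TILE_SIZE_PX = 256          # tile edge length stated under Training Data


@dataclass
class TilePrediction:
    """Hypothetical container: one confidence score per predicted AFB box."""
    scores: List[float]


def afb_density(tiles: Sequence[TilePrediction], score_threshold: float = 0.5) -> float:
    """Predicted detections per mm^2 of tile area seen (unit choice is an assumption)."""
    if not tiles:
        raise ValueError("no tiles were provided")
    # Count detections whose confidence clears the (assumed) score threshold.
    n_detections = sum(
        sum(1 for s in tile.scores if s >= score_threshold) for tile in tiles
    )
    # Area of one square tile, converted from microns to millimetres.
    tile_area_mm2 = (TILE_SIZE_PX * MICRONS_PER_PIXEL / 1000.0) ** 2
    return n_detections / (tile_area_mm2 * len(tiles))


def predict_wsi_label(
    tiles: Sequence[TilePrediction],
    density_threshold: float,
    score_threshold: float = 0.5,
) -> bool:
    """Return True (AFB positive) when the predicted density reaches the threshold."""
    return afb_density(tiles, score_threshold) >= density_threshold
```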
We quantified the WSI-level predictions with an ROC-like true _negative_ vs false _negative_ rate curve, and our model achieved an AUC of 0.78 on our validation set of WSIs.

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** NVIDIA L40S and H100 GPUs (on-premises hardware), used for various training runs
- **Hours used:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications

### Model Architecture and Objective

[ConvNeXt backbone](https://docs.pytorch.org/vision/stable/models/convnext.html) with [FCOS object detection head](https://docs.pytorch.org/vision/stable/models/fcos.html). We also experimented with a [ResNet50 backbone](https://docs.pytorch.org/vision/stable/models/resnet.html) and [Faster R-CNN object detection head](https://docs.pytorch.org/vision/stable/models/faster_rcnn.html), but we found moderately improved performance with the ConvNeXt + FCOS pair. [Faster R-CNN also suffered from a nasty bug](https://github.com/pytorch/vision/issues/8206) which FCOS allowed us to circumvent.

### Compute Infrastructure

#### Hardware

On-prem L40S and H100 GPUs.

#### Software

Model architecture used torchvision implementations. [See source code for further detail](https://github.com/arup-ri/afb_detection).

## Citation

**BibTeX:**

```
@article{english_use_2025,
  title = {Use of a convolutional neural network for direct detection of acid-fast bacilli from clinical specimens},
  volume = {0},
  url = {https://journals.asm.org/doi/10.1128/spectrum.00602-25},
  doi = {10.1128/spectrum.00602-25},
  number = {0},
  urldate = {2025-06-25},
  journal = {Microbiology Spectrum},
  author = {English, Paul and Morrison, Muir J. and Mathison, Blaine and Enrico, Elizabeth and Shean, Ryan and O'Fallon, Brendan and Rupp, Deven and Knight, Katie and Rangel, Alexandra and Gilivary, Jeffrey and Vance, Amanda and Hatch, Haleina and Lin, Leo and Ng, David P. and Shakir, Salika M.},
  month = jun,
  year = {2025},
  note = {Publisher: American Society for Microbiology},
  pages = {e00602--25},
}
```