File size: 3,206 Bytes

---
metrics:
- accuracy
tags:
- code
---
##MR-PLIP Description 
MR-PLIP is a vision-language foundation model trained on the multiresolution 34 million images curated from TCGA dataset. It can perform various vision-language processing (VLP) tasks such as image classification, detection and segmentation. 
##Uses
As per the original CLIP model card, this model is intended as a research output for research communities. We hope that this model will enable researchers to better understand and explore zero-shot, arbitrary image classification. We also hope it can be used for interdisciplinary studies of the potential impact of such model.
Direct Use
Zero-shot image classification, object detection and segmentation.
Downstream Use
Image classification and other image task fine-tuning, linear probe image classification, image generation guiding and conditioning, among others.
Intended Use
The model is intended as a research output for research communities. We hope that this model will enable researchers to better understand and explore zero-shot, arbitrary image classification. We also hope it can be used for interdisciplinary studies of the potential impact of such models.
Primary intended uses
The primary intended users of these models are AI researchers.
We primarily imagine the model will be used by researchers to better understand robustness, generalization, and other capabilities, biases, and constraints of computer vision histopathology models.
Out-of-Scope Use Cases
Any deployed use case of the model - whether commercial or not - is currently out of scope. Non-deployed use cases such as image search in a constrained environment, are also not recommended unless there is thorough in-domain testing of the model with a specific, fixed class taxonomy.
Since the model has not been purposefully trained in or evaluated on any languages other than English, its use should be limited to English language use cases.
Further the above notice, the Quilt-1M dataset used in training of these models has additional considerations, see below.
Training Data
This model was trained with 34 million is an image-text dataset for histopathology. Curated from whole slide images from TCGA dataset. It contributes the largest dataset for vision language modeling in histopathology.

IMPORTANT NOTE: The motivation behind dataset creation is to democratize research and experimentation around large-scale multi-modal model training and handling of uncurated, large-scale histopathology datasets crawled from publically available internet. Our recommendation is therefore to use the dataset for research purposes.
Disclaimer
It is important to note that the results obtained from this function are not intended to constitute medical advice or replace consultation with a qualified medical professional. The use of this function is solely at your own risk and should be consistent with applicable laws, regulations, and ethical considerations. We do not warrant or guarantee the accuracy, completeness, suitability, or usefulness of this function for any particular purpose, and we hereby disclaim any liability arising from any reliance placed on this function or any results obtained from its use.