|
|
---
license: cc-by-nc-nd-4.0
language:
- en
tags:
- Pathology
- arxiv:2505.11404
extra_gated_prompt: >-
  The Patho-CLIP-B model and its associated materials are released under the
  CC-BY-NC-ND 4.0 license. Access is restricted to non-commercial, academic
  research purposes only, with proper citation required. Any commercial usage,
  redistribution, or derivative work (including training models based on this
  model or generating datasets from its outputs) is strictly prohibited
  without prior written approval.

  Users must register with an official institutional email address (generic
  domains such as @gmail, @qq, @hotmail, etc. will not be accepted). By
  requesting access, you confirm that your information is accurate and
  current, and that you agree to comply with all terms listed herein. If
  other members of your organization wish to use the model, they must
  register independently and agree to the same terms.
extra_gated_fields:
  Full name (first and last): text
  Institutional affiliation (no abbreviations): text
  Role/Position:
    type: select
    options:
    - Faculty/Principal Investigator
    - PhD Student
    - Postdoctoral Researcher
    - Research Staff
    - Other
  Official institutional email (**must match your Hugging Face primary email; generic domains will be denied**): text
  Intended research use (be specific): text
  I agree to use this model only for non-commercial academic purposes: checkbox
  I agree not to redistribute this model or share it outside of my individual usage: checkbox
  I confirm that all submitted information is accurate and up to date: checkbox
---
|
|
\[[arXiv](https://arxiv.org/abs/2505.11404)\] | \[[GitHub Repo](https://github.com/Wenchuan-Zhang/Patho-R1)\] | \[[Cite](#citation❤️)\]
|
|
## Introduction📝 |
|
|
To bridge the gap between fine-grained tissue morphology and clinical semantic understanding in pathology, we present **Patho-CLIP-B**, a vision-language model tailored for high-resolution cross-modal representation learning in pathological diagnosis. |
|
|
|
|
|
**Patho-CLIP-B** is built on the OpenAI-CLIP-B architecture and trained through a two-stage progressive paradigm: |
|
|
|
|
|
- **Stage I:** Contrastive pretraining on PathGen-1.6M, focusing on cell morphology and tissue organization to embed high-resolution visual priors.

- **Stage II:** Joint training on a 3.5M-sample composite corpus comprising PathGen-1.6M, Quilt-1M, PathCap, and a textbook-derived dataset, to integrate domain-specific semantics with morphological features.
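The contrastive pretraining in Stage I follows the standard CLIP objective: paired image and caption embeddings are pulled together while every other pairing in the batch acts as a negative. A minimal NumPy sketch of that symmetric InfoNCE loss (an illustration of the general CLIP objective, not the project's actual training code):

```python
import numpy as np

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss used in CLIP-style contrastive pretraining.

    Row i of image_embs is paired with row i of text_embs; all other
    rows in the batch serve as negatives.
    """
    # L2-normalize so dot products become cosine similarities.
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    # Pairwise similarity matrix, sharpened by the temperature.
    logits = img @ txt.T / temperature
    n = logits.shape[0]

    def xent(l):
        # Row-wise cross-entropy with the matching pair (diagonal) as target.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
emb = rng.standard_normal((4, 64))
aligned = clip_contrastive_loss(emb, emb)  # perfectly matched pairs: loss near 0
mismatched = clip_contrastive_loss(emb, rng.standard_normal((4, 64)))
```

Perfectly aligned pairs drive the loss toward zero, while unrelated pairs leave it near the log of the batch size, which is why larger batches provide harder negatives in this setup.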
|
|
|
|
|
This strategy enables **Patho-CLIP-B** to achieve strong performance in semantic alignment, cross-modal retrieval, and tissue-level discrimination, offering a robust foundation for downstream pathology tasks. |
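At inference time, cross-modal retrieval with a CLIP-style model reduces to cosine similarity between the image embedding and candidate caption embeddings. A toy NumPy illustration of that scoring step (random vectors stand in for encoder outputs; in practice the embeddings would come from the Patho-CLIP-B image and text encoders, e.g. loaded via OpenCLIP):

```python
import numpy as np

def retrieval_scores(image_emb, text_embs, logit_scale=100.0):
    """Score candidate captions against one image, CLIP-style."""
    # L2-normalize so dot products become cosine similarities.
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = logit_scale * (txt @ img)
    # Softmax turns scaled similarities into a distribution over captions.
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

rng = np.random.default_rng(42)
texts = rng.standard_normal((3, 512))               # stand-ins for caption embeddings
image = texts[1] + 0.1 * rng.standard_normal(512)   # image closest to caption 1
probs = retrieval_scores(image, texts)
best = int(probs.argmax())
```

The `logit_scale` factor plays the role of CLIP's learned temperature: it sharpens the softmax so that small cosine-similarity gaps translate into confident rankings.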
|
|
|
|
|
## Acknowledgements🎖 |
|
|
We gratefully acknowledge the [OpenCLIP](https://github.com/mlfoundations/open_clip) project for providing an efficient and extensible implementation of CLIP models. Its flexible training pipeline, model support, and strong community contributions significantly facilitated the development and training of our Patho-CLIP-B model. |
|
|
|
|
|
We thank the authors and maintainers for their excellent work. |
|
|
|
|
|
## Citation❤️ |
|
|
If you find our work helpful, a citation would be greatly appreciated: |
|
|
|
|
|
```bibtex
@article{zhang2025patho,
  title={Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner},
  author={Zhang, Wenchuan and Zhang, Penghao and Guo, Jingru and Cheng, Tao and Chen, Jie and Zhang, Shuwan and Zhang, Zhang and Yi, Yuhao and Bu, Hong},
  journal={arXiv preprint arXiv:2505.11404},
  year={2025}
}
```