|
|
---
license: cc-by-nc-nd-4.0
language:
- en
tags:
- Pathology
- arxiv:2505.11404
extra_gated_prompt: >-
  The Patho-CLIP-B model and its associated materials are released under the
  CC-BY-NC-ND 4.0 license. Access is restricted to non-commercial, academic
  research purposes only, with proper citation required. Any commercial usage,
  redistribution, or derivative work (including training models based on this
  model or generating datasets from its outputs) is strictly prohibited
  without prior written approval.

  Users must register with an official institutional email address (generic
  domains such as @gmail, @qq, @hotmail, etc. will not be accepted). By
  requesting access, you confirm that your information is accurate and
  current, and that you agree to comply with all terms listed herein. If
  other members of your organization wish to use the model, they must
  register independently and agree to the same terms.
extra_gated_fields:
  Full name (first and last): text
  Institutional affiliation (no abbreviations): text
  Role/Position:
    type: select
    options:
    - Faculty/Principal Investigator
    - PhD Student
    - Postdoctoral Researcher
    - Research Staff
    - Other
  Official institutional email (**must match your Hugging Face primary email; generic domains will be denied**): text
  Intended research use (be specific): text
  I agree to use this model only for non-commercial academic purposes: checkbox
  I agree not to redistribute this model or share it outside of my individual usage: checkbox
  I confirm that all submitted information is accurate and up to date: checkbox
---
|
|
\[[arXiv](https://arxiv.org/abs/2505.11404)\] | \[[GitHub Repo](https://github.com/Wenchuan-Zhang/Patho-R1)\] | \[[Cite](#citation❤️)\]
|
|
## Introduction📝 |
|
|
To bridge the gap between fine-grained tissue morphology and clinical semantic understanding in pathology, we present **Patho-CLIP-B**, a vision-language model tailored for high-resolution cross-modal representation learning in pathological diagnosis. |
|
|
|
|
|
**Patho-CLIP-B** is built on the OpenAI-CLIP-B architecture and trained through a two-stage progressive paradigm: |
|
|
|
|
|
- **Stage I:** Contrastive pretraining on PathGen-1.6M, focusing on cell morphology and tissue organization to embed high-resolution visual priors.

- **Stage II:** Joint training on a 3.5M-sample composite corpus comprising PathGen-1.6M, Quilt-1M, PathCap, and a textbook-derived dataset, to integrate domain-specific semantics with morphological features.
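The contrastive pretraining in Stage I follows the standard CLIP objective: paired image and caption embeddings are pulled together while every other pairing in the batch acts as a negative. A minimal NumPy sketch of that symmetric InfoNCE loss (an illustration of the general CLIP objective, not the project's actual training code):

```python
import numpy as np

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss used in CLIP-style contrastive pretraining.

    Row i of image_embs is paired with row i of text_embs; all other
    rows in the batch serve as negatives.
    """
    # L2-normalize so dot products become cosine similarities.
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    # Pairwise similarity matrix, sharpened by the temperature.
    logits = img @ txt.T / temperature
    n = logits.shape[0]

    def xent(l):
        # Row-wise cross-entropy with the matching pair (diagonal) as target.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
emb = rng.standard_normal((4, 64))
aligned = clip_contrastive_loss(emb, emb)  # perfectly matched pairs: loss near 0
mismatched = clip_contrastive_loss(emb, rng.standard_normal((4, 64)))
```

Perfectly aligned pairs drive the loss toward zero, while unrelated pairs leave it near the log of the batch size, which is why larger batches provide harder negatives in this setup.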
|
|
|
|
|
This strategy enables **Patho-CLIP-B** to achieve strong performance in semantic alignment, cross-modal retrieval, and tissue-level discrimination, offering a robust foundation for downstream pathology tasks. |
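At inference time, cross-modal retrieval with a CLIP-style model reduces to cosine similarity between the image embedding and candidate caption embeddings. A toy NumPy illustration of that scoring step (random vectors stand in for encoder outputs; in practice the embeddings would come from the Patho-CLIP-B image and text encoders, e.g. loaded via OpenCLIP):

```python
import numpy as np

def retrieval_scores(image_emb, text_embs, logit_scale=100.0):
    """Score candidate captions against one image, CLIP-style."""
    # L2-normalize so dot products become cosine similarities.
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = logit_scale * (txt @ img)
    # Softmax turns scaled similarities into a distribution over captions.
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

rng = np.random.default_rng(42)
texts = rng.standard_normal((3, 512))               # stand-ins for caption embeddings
image = texts[1] + 0.1 * rng.standard_normal(512)   # image closest to caption 1
probs = retrieval_scores(image, texts)
best = int(probs.argmax())
```

The `logit_scale` factor plays the role of CLIP's learned temperature: it sharpens the softmax so that small cosine-similarity gaps translate into confident rankings.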
|
|
|
|
|
## Acknowledgements🎖 |
|
|
We gratefully acknowledge the [OpenCLIP](https://github.com/mlfoundations/open_clip) project for providing an efficient and extensible implementation of CLIP models. Its flexible training pipeline, model support, and strong community contributions significantly facilitated the development and training of our Patho-CLIP-B model. |
|
|
|
|
|
We thank the authors and maintainers for their excellent work. |
|
|
|
|
|
## Citation❤️ |
|
|
If you find our work helpful, a citation would be greatly appreciated: |
|
|
|
|
|
```bibtex
@article{zhang2025patho,
  title={Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner},
  author={Zhang, Wenchuan and Zhang, Penghao and Guo, Jingru and Cheng, Tao and Chen, Jie and Zhang, Shuwan and Zhang, Zhang and Yi, Yuhao and Bu, Hong},
  journal={arXiv preprint arXiv:2505.11404},
  year={2025}
}
```