How to use from the Transformers library
# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("image-to-text", model="jshhhh/PathFLIP")
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("jshhhh/PathFLIP", dtype="auto")

PathFLIP

Model weights for the paper PathFLIP: Fine-Grained Language-Image Pretraining for Versatile Pathology Image Understanding.

Overview

PathFLIP is a pathology vision-language model that aligns fine-grained morphological sub-captions with their corresponding regions in Whole Slide Images. Unlike prior pathology VLMs that pair an entire slide with a single report-level anchor, PathFLIP introduces region-statement correspondence through a region Q-Former and a region-level contrastive objective with caption-swapped negatives, learning region-level alignment without any manual spatial annotation. This fine-grained supervision enables strong slide-level classification and retrieval performance, and gives rise to an emergent visual grounding capability.
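
The region-level contrastive objective described above can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: it assumes an InfoNCE-style loss in which each region embedding is pulled toward its own sub-caption and pushed away from the sub-captions of other regions, which is one plausible reading of "caption-swapped negatives". The function name `region_contrastive_loss`, the `temperature` value, and the 2-D embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def region_contrastive_loss(region_embs, caption_embs, temperature=0.07):
    """InfoNCE-style loss over regions (an assumed formulation).

    Region i's positive is caption i; every other caption in the batch
    acts as a swapped (mismatched) negative for that region.
    """
    assert len(region_embs) == len(caption_embs)
    losses = []
    for i, region in enumerate(region_embs):
        logits = [cosine(region, cap) / temperature for cap in caption_embs]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of the matching caption.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy data: two regions and their (hypothetical) sub-caption embeddings.
regions = [[1.0, 0.0], [0.0, 1.0]]
captions = [[0.9, 0.1], [0.1, 0.9]]

aligned = region_contrastive_loss(regions, captions)
swapped = region_contrastive_loss(regions, captions[::-1])
print(f"aligned loss: {aligned:.4f}, swapped loss: {swapped:.4f}")
```

With correctly paired captions the loss is close to zero, and reversing the caption order makes it much larger, which is the signal such an objective would use to learn region-statement correspondence.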

Model Details

  • Base model: Qwen3-0.6B
  • Training data: FGC-4K Dataset
  • Tasks: classification, image-text retrieval, visual grounding, visual question answering (VQA)
  • Languages: English
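
For the image-text retrieval task listed above, a contrastively trained model is typically used by ranking candidates by cosine similarity to a query embedding. The sketch below shows only that ranking step on hand-written vectors. How PathFLIP itself exposes slide or text embeddings is not documented here, so `rank_slides`, the slide identifiers, and the 3-D embeddings are hypothetical placeholders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rank_slides(query_emb, slide_embs, top_k=3):
    """Return the top_k (slide_id, score) pairs, highest cosine similarity first."""
    query = l2_normalize(query_emb)
    scored = []
    for slide_id, emb in slide_embs.items():
        slide = l2_normalize(emb)
        score = sum(a * b for a, b in zip(query, slide))
        scored.append((score, slide_id))
    scored.sort(reverse=True)
    return [(slide_id, score) for score, slide_id in scored[:top_k]]


# Hypothetical precomputed slide embeddings and a text-query embedding.
slide_embs = {
    "slide_a": [0.9, 0.1, 0.0],
    "slide_b": [0.0, 1.0, 0.2],
    "slide_c": [0.5, 0.5, 0.5],
}
query = [1.0, 0.0, 0.0]

ranking = rank_slides(query, slide_embs, top_k=2)
for slide_id, score in ranking:
    print(f"{slide_id}: {score:.3f}")
```

The same ranking works in the other direction (slide query, text candidates) by swapping which embeddings are treated as the query and which as the candidates.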

License

This model is released under CC BY-NC 4.0 — free for academic and research use, not for commercial use or clinical deployment.

Safetensors

  • Model size: 0.1B params
  • Tensor types: F32, BF16