Commit 15a2001 by jshhhh · verified · 1 parent: bbc2b79

Update README.md

  1. README.md +29 -0
README.md CHANGED
@@ -1,3 +1,32 @@
  ---
  license: cc-by-nc-4.0
+ language:
+ - en
+ pipeline_tag: image-to-text
+ tags:
+ - medical
+ - pathology
+ - vision-language
+ - contrastive-learning
+ - fine-grained
+ - multimodal
+ library_name: transformers
  ---
+ # PathFLIP
+
+ Model weights for the paper *PathFLIP: Fine-Grained Language-Image Pretraining for Versatile Pathology Image Understanding*.
+
+ ## Overview
+
+ PathFLIP is a pathology vision-language model that aligns fine-grained morphological sub-captions with their corresponding regions in whole slide images (WSIs). Unlike prior pathology VLMs that pair an entire slide with a single report-level anchor, PathFLIP establishes region-statement correspondence through a region Q-Former and a region-level contrastive objective with caption-swapped negatives, so it learns region-level alignment without any manual spatial annotation. This fine-grained supervision enables strong slide-level classification and retrieval performance and gives rise to an emergent visual grounding capability.
+
+ ## Model Details
+
+ - **Base model**: *Qwen3-0.6B*
+ - **Training data**: [FGC-4K Dataset](https://huggingface.co/datasets/jshhhh/PathFLIP/)
+ - **Tasks**: classification, image-text retrieval, visual grounding, visual question answering (VQA)
+ - **Languages**: English
+
+ ## License
+
+ This model is released under CC BY-NC 4.0: free for academic and research use, **not for commercial use or clinical deployment**.
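
The Overview in this README mentions a region-level contrastive objective with caption-swapped negatives. As a rough illustration of that idea, the sketch below computes a generic symmetric InfoNCE loss over matched (region, sub-caption) embedding pairs, where pairing a region with any other region's sub-caption counts as a negative. This is an assumption about what "caption-swapped" means, not the PathFLIP implementation: the function names, the temperature value, and the use of plain Python lists are all hypothetical choices made for readability.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def log_softmax_at(logits, target):
    """Log-probability of index `target` under a softmax over `logits` (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[target] - log_z


def region_contrastive_loss(region_embs, caption_embs, temperature=0.07):
    """Symmetric InfoNCE loss (illustrative, not the authors' code).

    region_embs[i] is assumed to be matched with caption_embs[i]. Every other
    caption j != i plays the role of a swapped-in negative for region i, and
    every other region plays the same role for caption i.
    """
    regions = [l2_normalize(r) for r in region_embs]
    captions = [l2_normalize(c) for c in caption_embs]
    n = len(regions)

    # Temperature-scaled cosine similarity between every region and every caption.
    sim = [[dot(r, c) / temperature for c in captions] for r in regions]

    # Region -> caption direction: each row should peak on its diagonal entry.
    r2c = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # Caption -> region direction: each column should peak on its diagonal entry.
    c2r = -sum(
        log_softmax_at([sim[j][i] for j in range(n)], i) for i in range(n)
    ) / n
    return 0.5 * (r2c + c2r)


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]  # captions exchanged between regions
    print("matched:", region_contrastive_loss(regions, matched))
    print("swapped:", region_contrastive_loss(regions, swapped))
```

With correctly matched pairs the loss is close to zero, while exchanging the captions between regions drives it up sharply, which is the signal a contrastive objective of this kind uses to push region and sub-caption embeddings into alignment.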