---
license: apache-2.0
language:
- en
pipeline_tag: image-to-text
tags:
- medical
---

# Model Card for PathBLIP-2

A vision-language model for pathology report generation and cross-modal retrieval of melanocytic lesions, built on the BLIP-2 framework with HIPT as the whole slide image encoder and BioGPT as the language model.

## Model Details

This repository contains multiple checkpoints of the model used for the experiments in the paper.
The model was trained and evaluated, under different training configurations, on a dataset of 19,636 melanocytic lesion cases, each consisting of one or more whole slide images (WSIs) and a pathology report.
The supporting code is available from the corresponding GitHub repository.
Please refer to the paper for more information on the dataset, training, evaluation, and limitations.

- **Paper:** ["On the Importance of Text Preprocessing for Multimodal Representation Learning and Pathology Report Generation"](https://openreview.net/forum?id=5fQchwJgQr)
- **Repository:** [GitHub](https://github.com/nuldertien/PathBLIP-2)
- **Framework:** [BLIP-2](https://github.com/salesforce/LAVIS/tree/main/projects/blip2)
- **Base language model:** [BioGPT](https://huggingface.co/microsoft/biogpt)
- **WSI feature extractor:** [HIPT](https://github.com/mahmoodlab/HIPT)
- **License:** Apache-2.0