# MedCLIP: Fine-tuning a CLIP model on the ROCO medical dataset

<h3 align="center">
<!-- <p>MedCLIP</p> -->
<img src="./assets/logo.png" alt="huggingface-medclip" width="250" height="250">
</h3>

## Summary
This repository contains the code for fine-tuning a CLIP model on the [ROCO dataset](https://github.com/razorx89/roco-dataset), a dataset of radiology images paired with text captions.
This work was done as part of the [**Flax/Jax community week**](https://github.com/huggingface/transformers/blob/master/examples/research_projects/jax-projects/README.md#quickstart-flax-and-jax-in-transformers) organized by Hugging Face and Google.

### Demo
You can try a Streamlit demo app that uses this model on [🤗 Spaces](https://huggingface.co/spaces/kaushalya/medclip-roco). You may have to sign up for the 🤗 Spaces private beta to access this app (screenshot shown below).

🤗 Hub Model card: https://huggingface.co/flax-community/medclip-roco

## Dataset

Each image is accompanied by a text caption. The caption length varies from a few characters (a single word) to 2,000 characters. During preprocessing we remove all images that have a caption shorter than 10 characters.

- Training set: 57,780 images with their captions
- Validation set: 7,200 images
- Test set: 7,650 images
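
For concreteness, here is a minimal sketch of that filtering step (the variable and field names are hypothetical, not the repo's actual preprocessing code):

```python
MIN_CAPTION_LEN = 10  # threshold described above

def filter_short_captions(examples):
    """Keep only image-caption pairs whose caption has at least 10 characters."""
    return [ex for ex in examples if len(ex["caption"]) >= MIN_CAPTION_LEN]
```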

- [ ] Give an example

## Installation 💽
This repo depends on the master branch of the [Hugging Face Transformers library](https://github.com/huggingface/transformers). First clone the Transformers repository, then install it locally (preferably inside a virtual environment) with `pip install -e ".[flax]"`, as shown below.
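
A minimal sketch of those steps (assuming the standard clone URL and an already-activated virtual environment):

```bash
git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e ".[flax]"  # editable install with the Flax extras
```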

## Model
You can load the pretrained model from the Hugging Face Hub with:

```python
from medclip.modeling_hybrid_clip import FlaxHybridCLIP

model = FlaxHybridCLIP.from_pretrained("flax-community/medclip-roco")
```
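
As a quick sanity check, here is a hedged usage sketch. It assumes the Hub checkpoint ships tokenizer files loadable with `AutoTokenizer` and that `FlaxHybridCLIP` exposes `get_text_features` mirroring transformers' Flax CLIP models; adjust if the repo's API differs.

```python
import jax.numpy as jnp
from transformers import AutoTokenizer

from medclip.modeling_hybrid_clip import FlaxHybridCLIP

# Assumption: the Hub repo includes tokenizer files for the text tower.
tokenizer = AutoTokenizer.from_pretrained("flax-community/medclip-roco")
model = FlaxHybridCLIP.from_pretrained("flax-community/medclip-roco")

# Embed two candidate captions and compare them by cosine similarity.
inputs = tokenizer(["chest x-ray", "brain MRI"], padding=True, return_tensors="np")
# Assumption: get_text_features mirrors transformers' Flax CLIP API.
text_embeds = model.get_text_features(
    inputs["input_ids"], attention_mask=inputs["attention_mask"]
)
text_embeds = text_embeds / jnp.linalg.norm(text_embeds, axis=-1, keepdims=True)
print(text_embeds @ text_embeds.T)  # pairwise caption similarities
```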

## Training
The model is trained using Flax/JAX on a Cloud TPU v3-8.
You can fine-tune the CLIP model implemented in Flax by simply running `sh run_medclip.sh`.
This is the validation loss curve we observed when we trained the model using the `run_medclip.sh` script.


## TODO
- [ ] Evaluation on downstream tasks
- [ ] Zero-shot learning performance
- [ ] Merge the demo app