| --- |
| license: mit |
| --- |
| |
| # Prompt2MedImage - Diffusion for Medical Images |
|
|
| Prompt2MedImage is a latent text-to-image diffusion model fine-tuned on medical images from the ROCO dataset. |
|
|
| The weights here are intended to be used with the 🧨 Diffusers library. |
|
|
| This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container. |
|
|
| ## Model Details |
| - **Developed by:** Nihir Chadderwala |
| - **Model type:** Diffusion-based text-to-medical-image generation model |
| - **Language:** English |
| - **License:** MIT |
| - **Model Description:** This latent text-to-image diffusion model can be used to generate high-quality medical images from text prompts. It uses a fixed, pretrained text encoder ([CLIP ViT-L/14](https://arxiv.org/abs/2103.00020)), as suggested in the [Imagen paper](https://arxiv.org/abs/2205.11487). |
|
|
|
|
| ## License |
|
|
| This model is open access and available to all, under the Do What the F*ck You Want To Public License, which further specifies rights and usage. |
| |
| - You may not use the model to deliberately produce or share illegal or harmful outputs or content. |
| - The author claims no rights on the outputs you generate; you are free to use them and are accountable for their use. |
| - You may re-distribute the weights and use the model commercially and/or as a service. |
| |
| |
| ## Run using PyTorch |
| |
| ```bash |
| pip install torch diffusers transformers |
| ``` |
| |
| Run the pipeline with the default PNDM scheduler: |
| |
| ```python |
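| import torch |
| from diffusers import StableDiffusionPipeline |
| |
| model_id = "Prompt2MedImage" |
| device = "cuda"  # the float16 weights loaded below assume a CUDA GPU |
| |
| # Load the fine-tuned weights in half precision and move them to the GPU. |
| pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) |
| pipe = pipe.to(device) |
| |
| # Generate one image from a radiology-style caption. |
| prompt = "Showing the subtrochanteric fracture in the porotic bone." |
| image = pipe(prompt).images[0] |
| |
| # The result is a PIL image; write it to disk. |
| image.save("porotic_bone_fracture.png") |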
| ``` |
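
The snippet above assumes a CUDA GPU and draws fresh random noise on every call, so repeated runs give different images. The following is a minimal sketch of one way to fall back to CPU, fix the random seed, and optionally swap the default PNDM scheduler for `DPMSolverMultistepScheduler`. The helper names `pick_device_and_dtype` and `generate`, along with their parameters and defaults, are illustrative assumptions and not part of this model or of Diffusers.

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool) -> Tuple[str, str]:
    """Return (device, torch dtype name); half precision is used only on GPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def generate(prompt, model_id="Prompt2MedImage", seed=0, steps=50, use_dpm_solver=False):
    """Hypothetical wrapper: load the pipeline and return one seeded PIL image."""
    # Heavy imports stay local so the helper above works without them installed.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    ).to(device)

    if use_dpm_solver:
        # Reuse the existing scheduler config so the noise schedule stays compatible.
        pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

    # A seeded generator makes the initial latent noise repeatable.
    generator = torch.Generator(device=device).manual_seed(seed)
    return pipe(prompt, num_inference_steps=steps, generator=generator).images[0]


# Example call (downloads the weights, so it is left commented out):
# image = generate("Showing the subtrochanteric fracture in the porotic bone.", seed=42)
# image.save("porotic_bone_fracture_seed42.png")
```

With a fixed `seed`, repeated calls on the same hardware and library versions should give the same image; results may still differ across devices or between float16 and float32.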
| |
| ## Citation |
| |
| ``` |
| O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich, |
| "Radiology Objects in COntext (ROCO): A Multimodal Image Dataset". |
| MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer, Cham, 2018. |
| doi: 10.1007/978-3-030-01364-6_20 |
| ``` |
| |