Add image-to-image pipeline tag and library name
This PR adds the `image-to-image` pipeline tag to the model card, ensuring that people can find this model at https://huggingface.co/models?pipeline_tag=image-to-image. It also adds the `library_name` tag for better searchability, and changes the license mentioned in the "Model Description" to MIT to avoid inconsistency with the metadata.
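For illustration, here is a minimal sketch of how the front-matter fields added by this PR can be read back out of a model card. The `README` string below mirrors the metadata block after this change; the hand-rolled parsing is for demonstration only (real tooling such as the Hub uses a proper YAML parser).

```python
# Sketch: extract simple top-level `key: value` fields from a model
# card's YAML front matter. Hand-rolled parsing for illustration only.

README = """\
---
license: mit
library_name: diffusers
pipeline_tag: image-to-image
tags:
- lumos
- image to image
---

Model card body follows here.
"""

def front_matter_fields(text: str) -> dict:
    """Return top-level `key: value` pairs between the --- markers."""
    lines = text.splitlines()
    if not lines or lines[0] != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line == "---":  # closing marker ends the front matter
            break
        if ":" in line and not line.startswith(("-", " ")):
            key, _, value = line.partition(":")
            if value.strip():  # skip list headers like "tags:"
                fields[key.strip()] = value.strip()
    return fields

fields = front_matter_fields(README)
print(fields["pipeline_tag"])   # image-to-image
print(fields["library_name"])   # diffusers
```

The Hub reads exactly these two fields to place the model under the image-to-image filter and to associate it with the diffusers library.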
README.md
CHANGED

```diff
@@ -1,5 +1,7 @@
 ---
 license: mit
+library_name: diffusers
+pipeline_tag: image-to-image
 tags:
 - lumos
 - image to image
@@ -7,6 +9,7 @@ tags:
 - novel view synthesis
 - image to video
 ---
+
 <p align="center">
   <img src="asset/logo.gif" height=20>
 </p>
@@ -41,7 +44,7 @@ Source code is available at https://github.com/xiaomabufei/lumos.
 
 - **Developed by:** Lumos
 - **Model type:** Diffusion-Transformer-based generative model
-- **License:**
+- **License:** MIT
 - **Model Description:** **Lumos-I2I** is a model designed for generating images based on image prompts. It utilizes a [Transformer Latent Diffusion architecture](https://arxiv.org/abs/2310.00426) and incorporates a fixed, pretrained vision encoder ([DINO](
 https://dl.fbaipublicfiles.com/dino/dino_vitbase16_pretrain/dino_vitbase16_pretrain.pth)). **Lumos-T2I** is a model that can be used to generate images based on text prompts.
 It is a [Transformer Latent Diffusion Model](https://arxiv.org/abs/2310.00426) that uses one fixed, pretrained text encoders ([T5](
```