nielsr (HF Staff) committed
Commit ef2c5e3 · verified · 1 parent: 50b6ac5

Add model card for Osprey-7b


This PR adds a model card for the Osprey-7b model.

It includes:
- A link to the paper [Osprey: Pixel Understanding with Visual Instruction Tuning](https://huggingface.co/papers/2312.10032).
- The `pipeline_tag: image-text-to-text` metadata, ensuring the model appears in relevant searches.
- The `library_name: transformers` metadata, as the model is compatible with the Hugging Face `transformers` library.
- A link to the GitHub repository: https://github.com/CircleRadon/Osprey.
- A concise overview of the model's features.

Please review and merge this PR if it looks good.

Files changed (1)
  1. README.md +18 -0
README.md ADDED
@@ -0,0 +1,18 @@
+ ---
+ library_name: transformers
+ pipeline_tag: image-text-to-text
+ ---
+
+ # Osprey: Pixel Understanding with Visual Instruction Tuning
+
+ [Osprey: Pixel Understanding with Visual Instruction Tuning](https://huggingface.co/papers/2312.10032)
+
+ [Code](https://github.com/CircleRadon/Osprey)
+
+ Osprey is a mask-text instruction tuning approach that extends multimodal large language models (MLLMs) by incorporating pixel-wise mask regions into language instructions, enabling **fine-grained visual understanding**. Given an input mask region, Osprey generates semantic descriptions at two levels of detail: a **short description** and a **detailed description**.
+
+ Osprey can seamlessly integrate with [SAM](https://github.com/facebookresearch/segment-anything) in point-prompt, box-prompt, and segment-everything modes to generate the semantics associated with specific parts or objects.
+
+ <p align="center" width="100%">
+ <img src="https://github.com/CircleRadon/Osprey/raw/main/assets/osprey.png" width="90%">
+ </p>
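
The model card above describes Osprey as consuming pixel-wise mask regions alongside text instructions. As a minimal illustration of what such a mask region looks like as data, here is a sketch that builds a binary mask from a box prompt using NumPy. The function name and shapes are illustrative assumptions for this sketch only, not Osprey's or SAM's actual API; in the real pipeline, SAM would predict a precise object mask from the prompt.

```python
import numpy as np

def box_to_mask(height, width, box):
    """Build a binary mask for a box region (x0, y0, x1, y1).

    Illustrative only: in practice SAM predicts an object-shaped
    mask from a point or box prompt, and Osprey then pairs that
    pixel-wise mask with the language instruction.
    """
    x0, y0, x1, y1 = box
    mask = np.zeros((height, width), dtype=bool)  # one bit per pixel
    mask[y0:y1, x0:x1] = True                     # mark the region
    return mask

# A 4x6 image with a 2x2 box region selected.
mask = box_to_mask(4, 6, (1, 1, 3, 3))
print(mask.sum())  # number of pixels inside the mask region
```

A mask like this (one boolean per pixel) is what distinguishes region-level models such as Osprey from box-only approaches: the region can follow an arbitrary object outline rather than a rectangle.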