Add model card for Osprey-7b

#2
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +18 -0
README.md ADDED
@@ -0,0 +1,18 @@
+ ---
+ library_name: transformers
+ pipeline_tag: image-text-to-text
+ ---
+
+ # Osprey: Pixel Understanding with Visual Instruction Tuning
+
+ [Osprey: Pixel Understanding with Visual Instruction Tuning](https://huggingface.co/papers/2312.10032)
+
+ [Code](https://github.com/CircleRadon/Osprey)
+
+ Osprey is a mask-text instruction tuning approach that extends multimodal large language models (MLLMs) by incorporating pixel-wise mask regions into language instructions, enabling **fine-grained visual understanding**. Given an input mask region, Osprey generates semantic descriptions at two levels: a **short description** and a **detailed description**.
+
+ Osprey integrates with [SAM](https://github.com/facebookresearch/segment-anything) in its point-prompt, box-prompt, and segment-everything modes to generate the semantics associated with specific parts or objects.
+
+ <p align="center" width="100%">
+ <img src="https://github.com/CircleRadon/Osprey/raw/main/assets/osprey.png" width="90%">
+ </p>