library_name: transformers
pipeline_tag: image-text-to-text

Osprey: Pixel Understanding with Visual Instruction Tuning

Code

Osprey is a mask-text instruction tuning approach that extends multimodal large language models (MLLMs) by incorporating pixel-wise mask regions into language instructions, enabling fine-grained visual understanding. Given an input mask region, Osprey generates a semantic description of that region, either as a short description or as a detailed description.
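To make the idea of a "mask-text instruction" concrete, the sketch below pairs binary masks with a text prompt that refers to each mask through a placeholder token. It is only an illustration of the data shape: `REGION_TOKEN`, `MaskInstruction`, `build_instruction`, and the prompt wording are all hypothetical names chosen here and are not taken from the Osprey codebase.

```python
from dataclasses import dataclass
from typing import List

Mask = List[List[int]]  # binary mask: 1 marks a pixel inside the region

# Hypothetical placeholder; Osprey's real region token may differ.
REGION_TOKEN = "<region>"


@dataclass
class MaskInstruction:
    """A text prompt plus the masks its placeholders refer to, in order."""
    prompt: str
    masks: List[Mask]


def build_instruction(masks: List[Mask], detailed: bool = False) -> MaskInstruction:
    """Build a prompt with one placeholder per mask (assumed format)."""
    if not masks:
        raise ValueError("at least one mask is required")
    height, width = len(masks[0]), len(masks[0][0])
    for mask in masks:
        # All masks must share the image resolution.
        if len(mask) != height or any(len(row) != width for row in mask):
            raise ValueError("all masks must have the same shape")
    refs = ", ".join(f"region{i + 1} {REGION_TOKEN}" for i in range(len(masks)))
    ask = "Describe it in detail." if detailed else "Give a short description."
    return MaskInstruction(prompt=f"The image contains {refs}. {ask}", masks=masks)


def mask_area(mask: Mask) -> int:
    """Number of pixels covered by the mask."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    instruction = build_instruction([[[0, 1], [1, 1]]], detailed=True)
    print(instruction.prompt)
    print(mask_area(instruction.masks[0]))
```

The `detailed` flag mirrors the two output styles mentioned above (short versus detailed description); how the actual model selects between them is not specified in this card.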

Osprey integrates with SAM (Segment Anything Model) in point-prompt, box-prompt, and segment-everything modes, so that the masks SAM produces can be described at the level of specific parts or objects.
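The three prompting modes differ only in how the mask is obtained; each resulting mask is then handed to Osprey in the same way. The sketch below imitates the three modes on a toy label grid using plain Python. It is a stand-in for illustration and does not call SAM or reproduce its API: `mask_from_point`, `mask_from_box`, and `masks_for_everything` are hypothetical helpers, and the "everything" mode here simply emits one mask per label value rather than running a real segmenter.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]


def mask_from_box(height: int, width: int, box: Tuple[int, int, int, int]) -> Grid:
    """Box prompt stand-in: mark every pixel inside (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    return [
        [1 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(width)]
        for y in range(height)
    ]


def mask_from_point(labels: Grid, point: Tuple[int, int]) -> Grid:
    """Point prompt stand-in: flood-fill the 4-connected region under (x, y)."""
    x, y = point
    height, width = len(labels), len(labels[0])
    target = labels[y][x]
    mask = [[0] * width for _ in range(height)]
    stack = [(x, y)]
    while stack:
        cx, cy = stack.pop()
        if not (0 <= cx < width and 0 <= cy < height):
            continue
        if mask[cy][cx] or labels[cy][cx] != target:
            continue
        mask[cy][cx] = 1
        stack.extend([(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)])
    return mask


def masks_for_everything(labels: Grid) -> Dict[int, Grid]:
    """Segment-everything stand-in: one mask per distinct label value."""
    ids = sorted({value for row in labels for value in row})
    return {i: [[1 if v == i else 0 for v in row] for row in labels] for i in ids}


if __name__ == "__main__":
    toy = [[0, 0, 1], [0, 1, 1], [2, 2, 1]]
    print(mask_from_point(toy, (2, 0)))
    print(mask_from_box(3, 3, (1, 0, 3, 2)))
    print(len(masks_for_everything(toy)))
```

In a real pipeline the masks would come from SAM itself, and each one would be passed to Osprey together with the image and a text instruction.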