Thank you so much. If I use your model in my research, how should I cite it or let you know? Would you be open to hearing more about my work? I would be glad to connect with you; my email is mahbub.hassan@ieee.org. Thank you.
Introducing prithivMLmods/DeepCaption-VLA-7B, a multimodal VLM for Captioning and Vision-Language Attribution (VLA), designed for reasoning with long-shot captions. It focuses on defining visual properties, object attributes, and scene details across a wide range of images and aspect ratios, and generates attribute-rich image captions. The model supports creative, artistic, and technical applications that require detailed descriptions. 🤗🔥