---
language: en
license: mit
tags:
- multimodal
- vision-language
- captioning
---
# Multimodal Caption Model
A vision-language model that generates text captions for visual inputs.