---
language: en
license: mit
tags:
- multimodal
- vision-language
- captioning
---

# Multimodal Caption Model

A model designed to generate textual descriptions from visual inputs.