Does it contain the Vision-Language Projector MLP?

by MrDojo0 - opened Jun 24, 2025

Does this model contain the projector MLP that maps the ViT's embedding space into the decoder's embedding space?
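
For anyone wanting to check this directly, here is a minimal sketch that filters a checkpoint's parameter names for projector-like entries. It assumes the projector can be recognised by a substring in its parameter names. The hint substrings and the example names below are hypothetical illustrations and are not taken from this model.

```python
# Substrings that often show up in projector parameter names in
# vision-language checkpoints. These are assumptions, not specific to this model.
PROJECTOR_HINTS = ("projector", "connector", "merger")


def find_projector_params(param_names, hints=PROJECTOR_HINTS):
    """Return the parameter names that look like they belong to a projector."""
    return [
        name for name in param_names
        if any(hint in name.lower() for hint in hints)
    ]


# Hypothetical parameter names, for illustration only.
example_names = [
    "vision_tower.encoder.layers.0.attn.qkv.weight",
    "multi_modal_projector.linear_1.weight",
    "multi_modal_projector.linear_2.weight",
    "language_model.layers.0.mlp.up_proj.weight",
]

print(find_projector_params(example_names))
```

With a real checkpoint, the names could come from the keys of `weight_map` in `model.safetensors.index.json`, if the repository ships a sharded safetensors checkpoint. An empty result would suggest the projector is either missing or named differently.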
