adhisetiawan's Collections

Multimodal Models

updated May 27, 2024

  • microsoft/kosmos-2-patch14-224

    Image-to-Text • 2B • Updated Nov 28, 2023 • 136k • 182

  • Tyrannosaurus/TinyGPT-V

    Updated Jan 19, 2024 • 50

  • naver-clova-ix/donut-base

    Image-to-Text • Updated Aug 13, 2022 • 60.5k • 246

  • llava-hf/llava-v1.6-34b-hf

    Image-Text-to-Text • 35B • Updated Jan 27, 2025 • 2.81k • 93

  • deepseek-ai/deepseek-vl-7b-base

    7B • Updated Mar 15, 2024 • 328 • 64

  • deepseek-ai/deepseek-vl-7b-chat

    Image-Text-to-Text • 7B • Updated Mar 15, 2024 • 6.03k • 269

  • vikhyatk/moondream2

    Image-Text-to-Text • 2B • Updated Sep 23, 2025 • 3.74M • 1.37k

  • zai-org/cogvlm-chat-hf

    Text Generation • 18B • Updated Dec 19, 2023 • 825 • 198

  • Qwen/Qwen-VL-Chat

    Text Generation • Updated Jan 25, 2024 • 43.4k • 381

  • Qwen/Qwen-VL

    Text Generation • Updated Jan 25, 2024 • 32.2k • 276

  • microsoft/git-base

    Image-to-Text • 0.2B • Updated Apr 24, 2023 • 7.13k • 106