ChiefTheLord's Collections
Vision-Language Alignment Datasets 🧊

updated Jan 17

A collection of public vision-language datasets, especially for frozen vision-language alignment.

  • HuggingFaceM4/the_cauldron

    Viewer • Updated May 6, 2024 • 1.88M • 288k • 547

  • 5CD-AI/LLaVA-CoT-o1-Instruct

    Viewer • Updated Nov 27, 2024 • 58.5k • 24 • 110

  • kargwalaryan/SynCap-Flickr8k

    Viewer • Updated Oct 3, 2024 • 7.96k • 50 • 2

    Note SynthRecap


  • UCSC-VLAA/Recap-COCO-30K

    Viewer • Updated Jun 12, 2024 • 30.5k • 248 • 26

    Note SynthRecap


  • liuhaotian/LLaVA-Instruct-150K

    Preview • Updated Jan 3, 2024 • 3.65k • 616

  • Xkev/LLaVA-CoT-100k

    Viewer • Updated Dec 20, 2025 • 98.6k • 2.65k • 105

  • jackyhate/text-to-image-2M

    Viewer • Updated Apr 30 • 649k • 9.33k • 164

  • HuggingFaceM4/Docmatix

    Viewer • Updated Aug 26, 2024 • 2.55M • 13k • 305

  • CaptionEmporium/conceptual-captions-cc12m-llavanext

    Viewer • Updated Jun 30, 2024 • 11M • 711 • 23

  • IlyaGusev/gpt_roleplay_realm

    Viewer • Updated Apr 7, 2024 • 435 • 285 • 105