HuggingFaceM4/the_cauldron
A collection of public vision-language datasets, especially for frozen vision-language alignment.
Note SynthRecap