HuggingFaceM4/the_cauldron
A collection of public datasets for vision-language modalities, especially for Frozen Vision-Language Alignment.
Note: SynthRecap