FG-CLIP 2: A Bilingual Fine-grained Vision-Language Alignment Model • Paper 2510.10921 • Published Oct 13
FG-CLIP 2 Collection: FG-CLIP 2 is the foundation model for fine-grained vision-language understanding in both English and Chinese. • 10 items • Updated Nov 6