StevenHH2000/Fine-R1-3B
Image-Text-to-Text • 4B
Welcome to Fine-R1 👋, the first multimodal large language model (MLLM) to surpass strong CLIP-like models in fine-grained visual recognition.