Article Everything You Need to Know about Knowledge Distillation Kseniase • Mar 6, 2025 • 80
Article SigLIP 2: A better multilingual vision language encoder +1 ariG23498, merve, qubvel-hf • Feb 21, 2025 • 212
EVLM: An Efficient Vision-Language Model for Visual Understanding Paper • 2407.14177 • Published Jul 19, 2024 • 45
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control Paper • 2407.12781 • Published Jul 17, 2024 • 13
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos Paper • 2407.12679 • Published Jul 17, 2024 • 8