Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution Paper • 2307.06304 • Published Jul 12, 2023 • 36
view article Article SigLIP 2: A better multilingual vision language encoder +1 ariG23498, merve, qubvel-hf • Feb 21, 2025 • 216