SmolVLM: Redefining small and efficient multimodal models • Paper • 2504.05299 • Published Apr 7
Vision Language Models Quantization • Collection • Vision Language Models (VLMs) quantized by Neural Magic • 20 items • Updated Mar 4
MambaVision • Collection • MambaVision: A Hybrid Mamba-Transformer Vision Backbone; includes both 1K and 21K pretrained models • 13 items • Updated 4 days ago
MoshiVis v0.1 • Collection • MoshiVis is a vision-speech model built as a perceptually augmented version of Moshi v0.1 for conversing about image inputs • 9 items • Updated 4 days ago
Welcome Gemma 3: Google's all-new multimodal, multilingual, long-context open LLM • Article • Published Mar 12