---
license: apache-2.0
datasets:
- allenai/Molmo2-Cap
- allenai/Molmo2-VideoCapQA
- allenai/Molmo2-VideoSubtitleQA
- allenai/Molmo2-AskModelAnything
- allenai/Molmo2-VideoPoint
- allenai/Molmo2-VideoTrack
- allenai/Molmo2-MultiImageQA
- allenai/Molmo2-SynMultiImageQA
- allenai/Molmo2-MultiImagePoint
language:
- en
base_model:
- Qwen/Qwen3-8B
- google/siglip-so400m-patch14-384
pipeline_tag: video-text-to-text
library_name: transformers
tags:
- multimodal
- olmo
- molmo
- molmo2
- mlx
---

# mlx-community/Molmo2-8B-4bit

This model was converted to MLX format from [`allenai/Molmo2-8B`](https://huggingface.co/allenai/Molmo2-8B) using mlx-vlm version **0.3.10**.
Refer to the [original model card](https://huggingface.co/allenai/Molmo2-8B) for more details on the model.

## Use with mlx

```bash
pip install -U mlx-vlm
```

```bash
python -m mlx_vlm.generate --model mlx-community/Molmo2-8B-4bit --max-tokens 100 --temperature 0.0 --prompt "Describe this image." --image <path_to_image>
```
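
To run the same command from a script, for example over several images, the following is a minimal sketch that assembles the argument list for the `mlx_vlm.generate` call shown above. The helper name `build_generate_command` and its parameters are hypothetical and not part of mlx-vlm; it uses only the flags that appear in the command above and assumes mlx-vlm is installed in the active Python environment.

```python
import subprocess  # needed only for the usage shown in the comment below
import sys


def build_generate_command(
    image_path,
    prompt="Describe this image.",
    max_tokens=100,
    temperature=0.0,
    model="mlx-community/Molmo2-8B-4bit",
):
    """Return the argv list for the mlx_vlm.generate CLI call.

    Hypothetical helper: it only mirrors the flags used in the command
    above and does not call any mlx-vlm Python API directly.
    """
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", model,
        "--max-tokens", str(max_tokens),
        "--temperature", str(temperature),
        "--prompt", prompt,
        "--image", str(image_path),
    ]


# Example usage (downloads the model on first run):
# subprocess.run(build_generate_command("photo.jpg"), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the prompt or the image path contains spaces.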