---
library_name: transformers
license: apache-2.0
datasets:
- HuggingFaceM4/the_cauldron
- HuggingFaceM4/Docmatix
- lmms-lab/LLaVA-OneVision-Data
- lmms-lab/M4-Instruct-Data
- HuggingFaceFV/finevideo
- MAmmoTH-VL/MAmmoTH-VL-Instruct-12M
- lmms-lab/LLaVA-Video-178K
- orrzohar/Video-STaR
- Mutonix/Vript
- TIGER-Lab/VISTA-400K
- Enxin/MovieChat-1K_train
- ShareGPT4Video/ShareGPT4Video
pipeline_tag: image-text-to-text
tags:
- video-text-to-text
- mlx
language:
- en
base_model:
- HuggingFaceTB/SmolVLM-Instruct
---
# EZCon/SmolVLM2-2.2B-Instruct-mlx
This model was converted to MLX format from [`HuggingFaceTB/SmolVLM2-2.2B-Instruct`](https://huggingface.co/HuggingFaceTB/SmolVLM2-2.2B-Instruct) using mlx-vlm version **0.3.11**.
Refer to the [original model card](https://huggingface.co/HuggingFaceTB/SmolVLM2-2.2B-Instruct) for more details on the model.
## Use with mlx
```bash
pip install -U mlx-vlm
```
```bash
python -m mlx_vlm.generate --model EZCon/SmolVLM2-2.2B-Instruct-mlx --max-tokens 100 --temperature 0.0 --prompt "Describe this image." --image <path_to_image>
```
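To run the same command over several images, it can help to build the invocation from Python. The following is a minimal sketch that only reassembles the CLI call shown above: `build_generate_command` and `describe_image` are hypothetical helper names, not part of mlx-vlm, and the flags and default values are exactly those used in that command.

```python
import subprocess
import sys

MODEL_ID = "EZCon/SmolVLM2-2.2B-Instruct-mlx"


def build_generate_command(image_path, prompt="Describe this image.",
                           max_tokens=100, temperature=0.0):
    """Return the argv list for the mlx_vlm.generate CLI call (hypothetical helper)."""
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", MODEL_ID,
        "--max-tokens", str(max_tokens),
        "--temperature", str(temperature),
        "--prompt", prompt,
        "--image", str(image_path),
    ]


def describe_image(image_path, **kwargs):
    """Run the CLI for one image and return its stdout.

    Assumes mlx-vlm is installed in the current Python environment.
    """
    result = subprocess.run(
        build_generate_command(image_path, **kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits with an error
    )
    return result.stdout
```

For example, `print(describe_image("photo.jpg"))` would run the command for a local file named `photo.jpg` (a placeholder path). Passing the arguments as a list avoids shell quoting issues when the prompt contains spaces or quotes.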