	---
	library_name: transformers
	license: apache-2.0
	datasets:
	- HuggingFaceM4/the_cauldron
	- HuggingFaceM4/Docmatix
	- lmms-lab/LLaVA-OneVision-Data
	- lmms-lab/M4-Instruct-Data
	- HuggingFaceFV/finevideo
	- MAmmoTH-VL/MAmmoTH-VL-Instruct-12M
	- lmms-lab/LLaVA-Video-178K
	- orrzohar/Video-STaR
	- Mutonix/Vript
	- TIGER-Lab/VISTA-400K
	- Enxin/MovieChat-1K_train
	- ShareGPT4Video/ShareGPT4Video
	pipeline_tag: image-text-to-text
	tags:
	- video-text-to-text
	- mlx
	language:
	- en
	base_model:
	- HuggingFaceTB/SmolVLM-Instruct
	---

	# EZCon/SmolVLM2-2.2B-Instruct-mlx

	This model was converted to MLX format from [`HuggingFaceTB/SmolVLM2-2.2B-Instruct`](https://huggingface.co/HuggingFaceTB/SmolVLM2-2.2B-Instruct) using mlx-vlm version 0.3.11.
	Refer to the [original model card](https://huggingface.co/HuggingFaceTB/SmolVLM2-2.2B-Instruct) for more details on the model.

	## Use with mlx

	```bash
	pip install -U mlx-vlm
	```

	```bash
	python -m mlx_vlm.generate --model EZCon/SmolVLM2-2.2B-Instruct-mlx --max-tokens 100 --temperature 0.0 --prompt "Describe this image." --image <path_to_image>
	```
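
If you call the command above repeatedly, a small shell wrapper can save some typing. The following is a minimal sketch, not part of mlx-vlm: the function name `describe_image` and the `DRY_RUN` variable are hypothetical, and it uses only the flags already shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not provided by mlx-vlm): wraps the generate command above.
MODEL="EZCon/SmolVLM2-2.2B-Instruct-mlx"

describe_image() {
    local image="$1"
    local prompt="${2:-Describe this image.}"

    # Fail early if the image path is empty or not a regular file.
    if [ -z "$image" ] || [ ! -f "$image" ]; then
        echo "error: image not found: $image" >&2
        return 1
    fi

    # Build the command with the same flags as the example above.
    set -- python -m mlx_vlm.generate \
        --model "$MODEL" \
        --max-tokens 100 \
        --temperature 0.0 \
        --prompt "$prompt" \
        --image "$image"

    # DRY_RUN=1 (an assumption of this sketch) prints the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

After sourcing the script, `DRY_RUN=1 describe_image <path_to_image>` prints the command that would run, and `describe_image <path_to_image> "What is in this picture?"` runs it with a custom prompt. Note that the dry-run output is joined with spaces for readability and is not re-quoted for copy-pasting.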