---
tags:
- smolvlm2
- mirror
library_name: transformers
license: apache-2.0
---

# SmolVLM2-500M-Video-Instruct (full mirror)

Full mirror of [HuggingFaceTB/SmolVLM2-500M-Video-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM2-500M-Video-Instruct).

Includes:

- `model.safetensors` (~1.9 GB PyTorch weights)
- 14 ONNX variants under `onnx/` (fp16, int8, q4, uint8, etc. for decoder / embed_tokens / vision_encoder)
- Tokenizer files (`tokenizer.json`, `vocab.json`, `merges.txt`, `added_tokens.json`, `special_tokens_map.json`)
- Processor configs (`processor_config.json`, `preprocessor_config.json`, `chat_template.json`)
- `generation_config.json`, `config.json`

Mirrored via `huggingface_hub.snapshot_download`.

## Usage

```python
from transformers import AutoModel, AutoProcessor, AutoTokenizer

model = AutoModel.from_pretrained("arrow-hf/SmolVLM2-500M-Video-Instruct")
processor = AutoProcessor.from_pretrained("arrow-hf/SmolVLM2-500M-Video-Instruct")
tokenizer = AutoTokenizer.from_pretrained("arrow-hf/SmolVLM2-500M-Video-Instruct")
```

## Related

The tokenizer is used by [arrow-hf/smolvla-robotwin-stack-bowls-two-50pct](https://huggingface.co/arrow-hf/smolvla-robotwin-stack-bowls-two-50pct) (max_length=48). The SmolVLA policy is fine-tuned on top of this base VLM.
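
The `max_length=48` setting mentioned above can be sketched as a small tokenization helper. This is a minimal sketch, not the SmolVLA policy's actual preprocessing: the helper names `encode_instruction` and `load_tokenizer` are hypothetical, and the choice of `padding="max_length"` with `truncation=True` is an assumption about how a fixed 48-token length would typically be enforced.

```python
from typing import Any, Callable

REPO_ID = "arrow-hf/SmolVLM2-500M-Video-Instruct"
MAX_LENGTH = 48  # value stated in the Related section


def encode_instruction(
    tokenizer: Callable[..., Any], text: str, max_length: int = MAX_LENGTH
) -> Any:
    """Tokenize `text` to a fixed length.

    Assumption: short inputs are padded and long inputs truncated to
    `max_length`; the downstream policy may use different settings.
    """
    return tokenizer(
        text,
        max_length=max_length,
        padding="max_length",  # assumed: pad up to max_length
        truncation=True,       # assumed: cut anything beyond max_length
    )


def load_tokenizer(repo_id: str = REPO_ID) -> Any:
    """Load the mirrored tokenizer (needs `transformers` plus network or a warm cache)."""
    # Imported lazily so encode_instruction has no hard dependency on transformers.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)
```

A call such as `encode_instruction(load_tokenizer(), "stack the bowls")` would return the usual encoding with `input_ids` and `attention_mask`. Padding to a fixed length only works if the tokenizer defines a pad token, so check `tokenizer.pad_token` first if the call raises an error.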