--- license: apache-2.0 base_model: HuggingFaceTB/SmolVLM-Instruct tags: - vision-language - multimodal - chat - conversational - text-generation library_name: transformers pipeline_tag: text-generation --- # SmolVLM Final Merged This is a fine-tuned version of SmolVLM-Instruct, optimized for conversational AI and vision-language tasks. ## Model Details - **Base Model**: HuggingFaceTB/SmolVLM-Instruct - **Training**: Fine-tuned using LLaMA-Factory - **Use Cases**: Chat, vision understanding, multimodal reasoning - **License**: Apache 2.0 ## Usage ```python from transformers import AutoProcessor, AutoModelForVision2Seq import torch model = AutoModelForVision2Seq.from_pretrained("Tj/smolvlm-final-merged") processor = AutoProcessor.from_pretrained("Tj/smolvlm-final-merged") # Your inference code here