Instructions to use NexaAI/qwen3vl-4B-Instruct-fp16-mlx with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use NexaAI/qwen3vl-4B-Instruct-fp16-mlx with MLX:
# Make sure mlx-vlm is installed # pip install --upgrade mlx-vlm from mlx_vlm import load, generate from mlx_vlm.prompt_utils import apply_chat_template from mlx_vlm.utils import load_config # Load the model model, processor = load("NexaAI/qwen3vl-4B-Instruct-fp16-mlx") config = load_config("NexaAI/qwen3vl-4B-Instruct-fp16-mlx") # Prepare input image = ["http://images.cocodataset.org/val2017/000000039769.jpg"] prompt = "Describe this image." # Apply chat template formatted_prompt = apply_chat_template( processor, config, prompt, num_images=1 ) # Generate output output = generate(model, processor, formatted_prompt, image) print(output) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi new
How to use NexaAI/qwen3vl-4B-Instruct-fp16-mlx with Pi:
Start the MLX server
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "NexaAI/qwen3vl-4B-Instruct-fp16-mlx"
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "mlx-lm": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "NexaAI/qwen3vl-4B-Instruct-fp16-mlx" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use NexaAI/qwen3vl-4B-Instruct-fp16-mlx with Hermes Agent:
Start the MLX server
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "NexaAI/qwen3vl-4B-Instruct-fp16-mlx"
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default NexaAI/qwen3vl-4B-Instruct-fp16-mlx
Run Hermes
hermes
Qwen3-VL-4B-Instruct
Run Qwen3-VL-4B-Instruct optimized for Apple Silicon on MLX with NexaSDK.
Quickstart
Install NexaSDK
Run the model locally with one line of code:
nexa infer NexaAI/qwen3vl-4B-fp16-mlx
Model Description
Qwen3-VL-4B-Instruct is a 4-billion-parameter instruction-tuned multimodal large language model from Alibaba Cloud’s Qwen team.
As part of the Qwen3-VL series, it fuses powerful vision-language understanding with conversational fine-tuning, optimized for real-world applications such as chat-based reasoning, document analysis, and visual dialogue.
The Instruct variant is tuned for following user prompts naturally and safely — producing concise, relevant, and user-aligned responses across text, image, and video contexts.
Features
- Instruction-Following: Optimized for dialogue, explanation, and user-friendly task completion.
- Vision-Language Fusion: Understands and reasons across text, images, and video frames.
- Multilingual Capability: Handles multiple languages for diverse global use cases.
- Contextual Coherence: Balances reasoning ability with natural, grounded conversational tone.
- Lightweight & Deployable: 4B parameters make it efficient for edge and device-level inference.
Use Cases
- Visual chatbots and assistants
- Image captioning and scene understanding
- Chart, document, or screenshot analysis
- Educational or tutoring systems with visual inputs
- Multilingual, multimodal question answering
Inputs and Outputs
Input:
- Text prompts, image(s), or mixed multimodal instructions.
Output:
- Natural-language responses or visual reasoning explanations.
- Can return structured text (summaries, captions, answers, etc.) depending on the prompt.
License
Refer to the official Qwen license for terms of use and redistribution.
- Downloads last month
- 18
Quantized
Model tree for NexaAI/qwen3vl-4B-Instruct-fp16-mlx
Base model
Qwen/Qwen3-VL-4B-Instruct