Qwen3.5-9B-MLX-bf16

This is a full-precision (bfloat16) MLX version of Qwen/Qwen3.5-9B for Apple Silicon.

Model Details

  • Original Model: Qwen/Qwen3.5-9B
  • Precision: bfloat16 (no quantization)
  • Format: MLX SafeTensors
  • Framework: mlx-vlm
  • Disk Size: ~18 GB
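The ~18 GB figure follows directly from the parameter count: roughly 9 billion bf16 weights at 2 bytes each. A quick back-of-the-envelope check (the 9B count is nominal):

```python
# Estimate the on-disk size of a bf16 checkpoint: 2 bytes per parameter.
def bf16_checkpoint_size_gb(num_params: float) -> float:
    bytes_total = num_params * 2  # bfloat16 = 16 bits = 2 bytes
    return bytes_total / 1e9      # decimal gigabytes

print(bf16_checkpoint_size_gb(9e9))  # prints 18.0
```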

Conversion Details

This model was converted using mlx-vlm from the pc/fix-qwen35-predicate branch, which includes fixes for Qwen3.5 model support (proper handling of MoE gate layers, shared_expert_gate, and A_log casting).
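The A_log cast matters because bfloat16 stores only 7 explicit mantissa bits (roughly 3 significant decimal digits), so sensitive parameters lose precision if kept in bf16. The following is not mlx-vlm's actual code, just a NumPy sketch of round-toward-zero bf16 truncation that shows the precision loss:

```python
import numpy as np

def to_bf16(x: float) -> float:
    """Truncate a float32 to bfloat16 precision (round toward zero):
    keep the top 16 bits (sign + 8 exponent + 7 mantissa), zero the rest."""
    bits = np.float32(x).view(np.uint32)
    return float((bits & np.uint32(0xFFFF0000)).view(np.float32))

print(to_bf16(1.0))        # powers of two survive exactly: 1.0
print(to_bf16(0.1234567))  # only ~3 significant digits survive
```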

Conversion command:

python3 -m mlx_vlm convert \
  --hf-path "Qwen/Qwen3.5-9B" \
  --mlx-path "./mlx_models/Qwen3.5-9B-MLX-bf16"
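After conversion it is worth sanity-checking that the output directory actually holds the expected ~18 GB of weight shards. A small sketch (the helper name is mine, not part of mlx-vlm):

```python
from pathlib import Path

def safetensors_size_gb(model_dir: str) -> float:
    """Sum the sizes of all .safetensors shards directly under model_dir."""
    total = sum(f.stat().st_size for f in Path(model_dir).glob("*.safetensors"))
    return total / 1e9

# Expect roughly 18 for the bf16 conversion above:
# print(safetensors_size_gb("./mlx_models/Qwen3.5-9B-MLX-bf16"))
```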

Important Note

A better, more optimized conversion may be available from Prince Canuma (@Blaizzy), the mlx-vlm maintainer. Check the mlx-community organization on Hugging Face for updated versions once official Qwen3.5 support is merged into the main mlx-vlm branch.

Usage

from mlx_vlm import load, generate

model, processor = load("mlx-community/Qwen3.5-9B-MLX-bf16")

output = generate(
    model,
    processor,
    prompt="Describe this image in detail",
    image="path/to/image.jpg",
    max_tokens=200
)
print(output)

Or from the command line:

python3 -m mlx_vlm.generate \
  --model mlx-community/Qwen3.5-9B-MLX-bf16 \
  --prompt "Describe this image" \
  --image path/to/image.jpg \
  --max-tokens 200
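An ~18 GB bf16 model has to fit in unified memory alongside the KV cache and activations, so a Mac with at least 24 GB is a realistic floor. A rough pre-flight check (the 1.3x headroom factor is my assumption, not an mlx-vlm requirement):

```python
import os

def fits_in_memory(model_gb: float, total_bytes: int, overhead: float = 1.3) -> bool:
    """Rough check: model weights plus ~30% headroom for KV cache and activations."""
    return model_gb * 1e9 * overhead <= total_bytes

# Query physical RAM via POSIX sysconf (works on macOS and Linux):
ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
print(fits_in_memory(18, ram))
```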

License

This model inherits the Apache 2.0 license from the original Qwen3.5-9B model.
