Qwen3-VL-235B-A22B-Instruct – MLX nvfp4

MLX-format conversion of Qwen/Qwen3-VL-235B-A22B-Instruct (BF16 full precision) for Apple Silicon inference.

Quantization

Parameter         Value
Format            MLX safetensors
Quantization      nvfp4
Bits per weight   4.528
Group size        32
Shards            24
Total size        133.41 GB
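To make the group-size figure concrete, here is a simplified, illustrative sketch of nvfp4-style group quantization. This is not the actual implementation: real nvfp4 also quantizes the per-group scales (to FP8) and keeps some tensors in higher precision, which is why the effective bits per weight (4.528) is above a flat 4.

```python
# Simplified sketch of nvfp4-style group quantization (illustration only).
# Weights are split into groups of 32; each group gets a scale so its max
# magnitude maps onto the FP4 (E2M1) grid, then every weight is rounded
# to the nearest representable FP4 value.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 magnitudes
FP4_VALUES = sorted(set(v * s for v in FP4_GRID for s in (1, -1)))

def quantize_group(group, grid=FP4_VALUES):
    """Return (scale, 4-bit codes) for one group of weights."""
    scale = max(abs(w) for w in group) / 6.0 or 1.0  # map absmax onto +/-6
    codes = [min(grid, key=lambda g: abs(w / scale - g)) for w in group]
    return scale, codes

def dequantize_group(scale, codes):
    return [scale * c for c in codes]

# One 8-element slice of a hypothetical weight group:
weights = [0.03, -0.11, 0.26, -0.02, 0.17, 0.09, -0.30, 0.0]
scale, codes = quantize_group(weights)
recon = dequantize_group(scale, codes)  # approximate reconstruction
```

Each stored weight costs 4 bits plus a shared per-group scale, which is where the "Group size 32" parameter enters the size/accuracy trade-off.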

Usage

pip install mlx-vlm

# Text generation
python -m mlx_vlm generate \
    --model LibraxisAI/Qwen3-VL-235B-A22B-Instruct-mlx-nvfp4 \
    --prompt "What model are you?" \
    --max-tokens 128

# Vision
python -m mlx_vlm generate \
    --model LibraxisAI/Qwen3-VL-235B-A22B-Instruct-mlx-nvfp4 \
    --image photo.jpg \
    --prompt "Describe this image in detail." \
    --max-tokens 256

Hardware Requirements

  • Apple Silicon with ≥128 GB unified memory (tested on M3 Ultra 512 GB)
  • macOS 15+, MLX 0.30.4+
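As a rough check on that memory floor, a back-of-envelope estimate from the quantization table (assuming decimal GB) shows the weights alone occupy about 133 GB before any KV cache or activations, so 128 GB is a hard floor and more headroom is strongly preferable:

```python
# Back-of-envelope memory estimate from the quantization table above.
params = 235e9           # total parameters (all MoE experts stay resident)
bits_per_weight = 4.528  # effective bits, including scales/mixed precision

weights_gb = params * bits_per_weight / 8 / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")  # close to the listed 133.41 GB
# KV cache and activations come on top of this at inference time.
```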

Model Details

  • Architecture: Qwen3-VL (Vision-Language Model) with Mixture of Experts (128 experts, top-k routing)
  • Parameters: 235B total, ~22B active per token
  • Capabilities: Text, image, and video understanding
  • Source: Converted from the BF16 full-precision checkpoint using a patched mlx-vlm with per-tensor materialization, avoiding Metal GPU timeouts on very large models
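The "~22B active per token" figure follows from top-k expert routing: per token, a gating network scores all experts but only the top-k experts' FFN weights actually execute. A simplified sketch of that gating step (hypothetical helper name; real gating uses learned logits per token):

```python
import math

def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts; softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i],
                 reverse=True)[:k]
    exps = [math.exp(gate_logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# One token's gate scores over 8 experts; only the top 2 experts run.
routes = top_k_route([0.1, 2.0, -1.0, 0.5, 1.9, 0.0, -0.5, 0.3], k=2)
# The layer output is the weighted sum of the selected experts' outputs,
# so compute scales with k, not with the total expert count.
```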

Conversion

Converted with mlx-vlm (patched for 235B+ model support):

python -m mlx_vlm convert \
    --hf-path Qwen/Qwen3-VL-235B-A22B-Instruct \
    -q --q-bits 4 --q-mode nvfp4 --q-group-size 32 \
    --mlx-path Qwen3-VL-235B-A22B-Instruct-mlx-nvfp4

Patches required for models >100B: per-tensor lazy weight materialization before quantization to prevent Metal command buffer timeout. See LibraxisAI/mlx-vlm for the fixes.
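The pattern behind that patch can be sketched generically (stand-in functions below, not the real mlx-vlm API; in MLX the materialization step corresponds to evaluating each lazy tensor before quantizing it): rather than quantizing the whole lazily loaded weight tree as one graph, each tensor is forced into memory and quantized on its own, keeping every GPU submission small and bounded.

```python
# Generic sketch of per-tensor materialization. `materialize` and
# `quantize` are stand-ins: the point is the loop structure, which
# avoids building one huge lazy graph whose single GPU submission can
# exceed Metal's command-buffer timeout on 235B-parameter models.

def materialize(lazy_tensor):
    # Stand-in for forcing a lazy computation to run now (mx.eval in MLX).
    return lazy_tensor()

def quantize(tensor):
    # Stand-in for the real nvfp4 quantizer.
    return [round(x) for x in tensor]

def quantize_checkpoint(lazy_weights):
    quantized = {}
    for name, lazy_tensor in lazy_weights.items():
        tensor = materialize(lazy_tensor)   # evaluate ONE tensor at a time
        quantized[name] = quantize(tensor)  # small, bounded work item
    return quantized

# Lazy "checkpoint": thunks that produce tensors only when called.
ckpt = {"layer.0.w": lambda: [0.2, 1.7], "layer.1.w": lambda: [-0.9, 3.2]}
q = quantize_checkpoint(ckpt)
```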


Vibecrafted with AI Agents by VetCoders © 2026 The LibraxisAI Team
