mlx-community/Qwopus3.6-35B-A3B-Coder-8bit

This model mlx-community/Qwopus3.6-35B-A3B-Coder-8bit was converted to MLX format from Jackrong/Qwopus3.6-35B-A3B-Coder using mlx-vlm version 0.4.4.

This is an 8-bit quantized MLX conversion. It keeps the source model's chat template and multimodal processor configuration, so it accepts text/coding, image, and video-style inputs. The language model weights were quantized with MLX 8-bit affine quantization; the multimodal vision components are preserved for image/video inputs.

Refer to the original model card for model details, license, and intended use.

Use with mlx

pip install -U mlx-vlm

Image input

python -m mlx_vlm.generate \
  --model mlx-community/Qwopus3.6-35B-A3B-Coder-8bit \
  --max-tokens 512 \
  --temperature 0.0 \
  --prompt "Describe this image." \
  --image <path_to_image>

Text / coding input

python -m mlx_vlm.generate \
  --model mlx-community/Qwopus3.6-35B-A3B-Coder-8bit \
  --max-tokens 512 \
  --temperature 0.2 \
  --prompt "Write a Python function that parses a JSONL file and counts records by label."
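
To script the two commands above from Python, one option is to assemble the argument list and hand it to subprocess. The sketch below is illustrative only: build_generate_command is a hypothetical helper, not part of mlx-vlm, and it uses only the flags shown in the commands above. The file name photo.jpg is a placeholder.

```python
import shlex
from typing import List, Optional

MODEL_ID = "mlx-community/Qwopus3.6-35B-A3B-Coder-8bit"


def build_generate_command(
    prompt: str,
    image: Optional[str] = None,
    max_tokens: int = 512,
    temperature: float = 0.0,
    model: str = MODEL_ID,
) -> List[str]:
    """Hypothetical helper: build the argv for `python -m mlx_vlm.generate`.

    Only the flags that appear in this README are used.
    """
    cmd = [
        "python", "-m", "mlx_vlm.generate",
        "--model", model,
        "--max-tokens", str(max_tokens),
        "--temperature", str(temperature),
        "--prompt", prompt,
    ]
    # The --image flag is added only for image prompts.
    if image is not None:
        cmd += ["--image", image]
    return cmd


if __name__ == "__main__":
    # Print the shell-quoted command rather than running it, since running
    # requires mlx-vlm and the downloaded weights on Apple Silicon.
    cmd = build_generate_command("Describe this image.", image="photo.jpg")
    print(shlex.join(cmd))
```

Pass the returned list to subprocess.run(cmd, check=True) to actually start generation. Building a list rather than a single string avoids shell-quoting problems with prompts that contain quotes or spaces.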

Notes

This is an 8-bit quantized MLX version of Jackrong/Qwopus3.6-35B-A3B-Coder.
The model is intended for Apple Silicon inference with MLX.
For multimodal usage, prefer mlx-vlm rather than plain mlx-lm.
License: Apache 2.0, inherited from the source model metadata.

Conversion

mlx_vlm.convert \
  --hf-path Jackrong/Qwopus3.6-35B-A3B-Coder \
  --mlx-path Qwopus3.6-35B-A3B-Coder-8bit \
  --quantize \
  --q-bits 8 \
  --q-group-size 64 \
  --q-mode affine
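
To make the --q-bits 8, --q-group-size 64, and --q-mode affine settings concrete, here is a minimal pure-Python sketch of group-wise affine quantization. It illustrates the idea only and is not MLX's implementation: the function names are hypothetical, MLX packs the codes into integer words and runs on the GPU, and the 16-bit size assumed for each group's scale and bias is an assumption about the storage dtype.

```python
from typing import List, Tuple


def quantize_group(weights: List[float], bits: int = 8) -> Tuple[List[int], float, float]:
    """Quantize one group of weights to unsigned integer codes.

    Each weight w maps to round((w - bias) / scale), where bias is the
    group minimum and scale spreads the group range over 2**bits - 1 steps.
    """
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels
    if scale == 0.0:
        scale = 1.0  # constant group: every code becomes 0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, bias: float) -> List[float]:
    """Reconstruct approximate weights as scale * code + bias."""
    return [scale * q + bias for q in codes]


def bits_per_weight(bits: int = 8, group_size: int = 64, meta_bits: int = 16) -> float:
    """Storage cost per weight, counting one scale and one bias per group.

    meta_bits=16 is an assumption (scale and bias stored as 16-bit floats).
    """
    return bits + 2 * meta_bits / group_size


if __name__ == "__main__":
    group = [i / 63 - 0.5 for i in range(64)]  # one group of 64 weights
    codes, scale, bias = quantize_group(group)
    restored = dequantize_group(codes, scale, bias)
    worst = max(abs(a - b) for a, b in zip(group, restored))
    print(f"max abs error: {worst:.6f}, half step: {scale / 2:.6f}")
    print(f"bits per weight: {bits_per_weight()}")
```

Under these assumptions the reconstruction error per weight is at most half a quantization step, and the per-group scale and bias add 0.5 bits per weight, giving about 8.5 bits per quantized weight.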