---
language:
  - multilingual
license: other
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
library_name: mlx
base_model:
  - mlx-community/InternVL3-8B-bf16
tags:
  - mlx
  - mlx-vlm
  - internvl
  - internvl3
  - 4-bit
  - quantized
  - vision-language-model
  - apple-silicon
pipeline_tag: image-text-to-text
---

# InternVL3-8B-MLX-4bit

This repository contains a 4-bit quantized MLX conversion of mlx-community/InternVL3-8B-bf16 for inference on Apple Silicon.

## Conversion Details

| Setting | Value |
| --- | --- |
| Source model | mlx-community/InternVL3-8B-bf16 |
| Conversion tool | mlx_vlm.convert |
| Quantization bits | 4 |
| Group size | 64 |
| Quantization mode | affine |
| Quant predicate | none (uniform quantization) |
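
The affine mode with group size 64 listed above can be illustrated with a minimal NumPy sketch: each group of 64 weights is mapped to 4-bit integers via a per-group scale and minimum. This is an illustrative re-implementation, not the actual MLX kernel, and the exact rounding/packing details of mlx_vlm may differ.

```python
# Illustrative 4-bit affine quantization with group size 64 (NumPy sketch,
# not the real MLX implementation).
import numpy as np

BITS = 4
GROUP_SIZE = 64
LEVELS = 2**BITS - 1  # 15 representable steps per group

def quantize_affine(w: np.ndarray):
    """Quantize a 1-D float vector per group: q = round((w - min) / scale)."""
    groups = w.reshape(-1, GROUP_SIZE)
    w_min = groups.min(axis=1, keepdims=True)
    scale = (groups.max(axis=1, keepdims=True) - w_min) / LEVELS
    scale = np.where(scale == 0, 1.0, scale)  # constant groups: avoid div by zero
    q = np.round((groups - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize_affine(q, scale, w_min):
    """Reconstruct approximate weights from 4-bit codes plus per-group params."""
    return (q.astype(np.float32) * scale + w_min).reshape(-1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=4096).astype(np.float32)
    q, scale, w_min = quantize_affine(w)
    w_hat = dequantize_affine(q, scale, w_min)
    print("max abs error:", float(np.abs(w - w_hat).max()))
```

The reconstruction error is bounded by half the per-group scale, which is why smaller group sizes trade extra scale/offset storage for accuracy.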

Conversion command used:

```bash
python3 -m mlx_vlm convert \
  --hf-path "mlx-community/InternVL3-8B-bf16" \
  --mlx-path "./models/InternVL3-8B-4bit" \
  -q --q-bits 4 --q-group-size 64
```
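
A back-of-the-envelope size estimate shows why these settings matter: assuming roughly 8e9 parameters (the "8B" in the name), 4-bit storage, and about 4 bytes of scale/offset overhead per group of 64 values (an assumption; the real overhead depends on how MLX stores group parameters), the quantized weights shrink to a little over a quarter of the bf16 size. The actual repository size also includes unquantized layers and metadata, so treat this as a rough lower bound.

```python
# Rough size estimate for 4-bit, group-size-64 quantized weights.
# overhead_bytes_per_group = 4 is an assumed per-group scale/offset cost.
def quantized_size_gb(n_params: float, bits: int = 4, group_size: int = 64,
                      overhead_bytes_per_group: int = 4) -> float:
    weight_bytes = n_params * bits / 8
    group_bytes = (n_params / group_size) * overhead_bytes_per_group
    return (weight_bytes + group_bytes) / 1e9

bf16_gb = 8e9 * 2 / 1e9          # bf16 source: 2 bytes per parameter
q4_gb = quantized_size_gb(8e9)   # 4-bit weights plus group overhead
print(f"bf16: {bf16_gb:.1f} GB, 4-bit: {q4_gb:.1f} GB")
# prints: bf16: 16.0 GB, 4-bit: 4.5 GB
```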

## Validation

| Test | Status |
| --- | --- |
| Text generation load test | passed |

Verification command:

```bash
python3 -m mlx_vlm generate \
  --model "./models/InternVL3-8B-4bit" \
  --prompt "Reply with exactly: OK" \
  --max-tokens 8 --temperature 0
```

Observed response: `OK`
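
The validation step above can be scripted as a repeatable smoke test. The wrapper below is a hypothetical helper, not part of mlx-vlm: `check_response` (a name introduced here) compares the reply to the expected string ignoring surrounding whitespace, and the subprocess call only runs when executed directly on a machine with mlx-vlm and the converted model present.

```python
# Hypothetical smoke-test wrapper around the validation command above.
import subprocess

def check_response(observed: str, expected: str = "OK") -> bool:
    """Compare a model reply to the expected string, ignoring surrounding whitespace."""
    return observed.strip() == expected

if __name__ == "__main__":
    # Re-runs the verification command from this model card.
    out = subprocess.run(
        ["python3", "-m", "mlx_vlm", "generate",
         "--model", "./models/InternVL3-8B-4bit",
         "--prompt", "Reply with exactly: OK",
         "--max-tokens", "8", "--temperature", "0"],
        capture_output=True, text=True, check=True,
    )
    print("passed" if check_response(out.stdout) else "failed")
```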

## Usage

Install:

```bash
python3 -m pip install -U mlx-vlm
```

Run locally from this folder:

```bash
python3 -m mlx_vlm generate \
  --model "." \
  --prompt "Describe the image briefly." \
  --image path/to/image.jpg \
  --max-tokens 256 \
  --temperature 0
```

Run from Hugging Face after upload:

```bash
python3 -m mlx_vlm generate \
  --model "mlx-community/InternVL3-8B-MLX-4bit" \
  --prompt "Describe the image briefly." \
  --image path/to/image.jpg \
  --max-tokens 256 \
  --temperature 0
```

## Notes

- This conversion does not upload anything automatically.
- Quantization changes numerical behavior relative to the bf16 weights.
- During local tests, mlx_vlm emitted an upstream tokenizer regex warning originating from the source model assets.

## License

This conversion follows the license terms of the upstream source repository (Qwen license, linked in the metadata above).