
how did you quantize this?

#1
by LeePapa - opened

I'm trying to figure out how to quantize SigLIP 2 models. Specifically, I want to quantize this variant (fancyfeast/so400m-long), since it supports 256 tokens rather than the default 64. When I try mlx-vlm or mlx-lm, it says siglip2 is not supported.

MLX Community org
edited 1 day ago

Hey, yeah, I don't believe mlx-vlm or mlx-lm supports SigLIP 2, since it's technically neither a VLM nor an LLM. There's a sister package named mlx-embeddings, which I contributed SigLIP 2 support to, that should work for your purposes.

Because of how the model is named on Hugging Face, you'll have to download it manually, rename the folder to something like "fancyfeast-so400m-long-patch14-384", and pass that folder's path as hf_path when calling the convert() function, so that the image and patch size are correctly detected.

In case there's still any confusion, here's a snippet of my code:

from mlx_embeddings.utils import convert

convert(
    hf_path="google/siglip2-base-patch16-224",
    mlx_path="Siglip2-base-patch16-224-8bit",
    quantize=True,
    q_bits=8,
    skip_vision=False
)
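
For the manual route described above (download, rename the folder, then convert from the local path), a rough sketch might look like the following. It assumes huggingface_hub's snapshot_download is used for the download. The helper names local_model_dir and download_and_convert, and the "models" base directory, are made up for illustration. The repo ID in the usage note is copied from the question as written and may need adjusting to the full repository name.

```python
from pathlib import Path


def local_model_dir(base_dir: str, folder_name: str) -> Path:
    """Return the local folder the checkpoint is placed in before conversion.

    folder_name should carry the patch and image size (for example
    "...-patch14-384"), since the reply above says detection depends on it.
    """
    return Path(base_dir) / folder_name


def download_and_convert(repo_id: str, folder_name: str, mlx_path: str,
                         base_dir: str = "models", q_bits: int = 8) -> Path:
    """Hypothetical wrapper: download a repo into a renamed folder, then quantize."""
    # Imported lazily so local_model_dir works without these packages installed.
    from huggingface_hub import snapshot_download
    from mlx_embeddings.utils import convert

    target = local_model_dir(base_dir, folder_name)

    # Download the checkpoint files directly into the renamed folder.
    snapshot_download(repo_id=repo_id, local_dir=str(target))

    # Same arguments as the snippet above, but hf_path is the local folder.
    convert(
        hf_path=str(target),
        mlx_path=mlx_path,
        quantize=True,
        q_bits=q_bits,
        skip_vision=False,
    )
    return target
```

Calling it would then be something like download_and_convert("fancyfeast/so400m-long", "fancyfeast-so400m-long-patch14-384", "so400m-long-8bit"), where the output folder name "so400m-long-8bit" is just a placeholder.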
