Add Gemma 4-26B-A4B support: 4.15 tok/s on M4 Mac Mini (3f56a7b)
Nico Claude Opus 4.6 (1M context) committed on
How to use waltgrace/mlx-expert-sniper with MLX:
# Make sure mlx-vlm is installed
# pip install --upgrade mlx-vlm
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config
# Load the model
model, processor = load("waltgrace/mlx-expert-sniper")
config = load_config("waltgrace/mlx-expert-sniper")
# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."
# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=1
)
# Generate output
output = generate(model, processor, formatted_prompt, image)
print(output)
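
The commit title quotes a throughput figure in tok/s. The commit does not say how that figure was measured, so the following is only a minimal sketch of one way to time a generation call and convert it to tokens per second. The helper names `tokens_per_second` and `timed_generate` are hypothetical and not part of mlx-vlm, and the token counter is supplied by the caller.

```python
import time
from typing import Any, Callable, Tuple


def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Return throughput in tokens per second (hypothetical helper)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


def timed_generate(
    run: Callable[[], Any],
    count_tokens: Callable[[Any], int],
) -> Tuple[Any, float]:
    """Call a zero-argument generation function and time it.

    Returns (result, tok/s). Counting tokens is left to the caller,
    because the right counter depends on the tokenizer in use.
    """
    start = time.perf_counter()
    result = run()
    elapsed = time.perf_counter() - start
    return result, tokens_per_second(count_tokens(result), elapsed)
```

With the snippet above, `run` could be `lambda: generate(model, processor, formatted_prompt, image)`. Depending on the mlx-vlm version, `generate` may return a plain string or a result object, so `count_tokens` should be adapted to whichever it receives. This wall-clock figure includes prompt processing, so it will usually read lower than a pure decode rate.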