license: apache-2.0
tags:
  - vision
  - moondream
  - mlx
  - int8
  - quantized
base_model: moondream/moondream3-preview

MD3P-Int8 - INT8 Quantized Moondream3 for MLX

An INT8 quantized version of Moondream3, offering a balance between model quality and size for MLX deployment.

Model Details

| Model | Size | Quality | Use Case |
|---|---|---|---|
| md3p-int8 (this) | 10 GB | Higher | Desktop/Server MLX |
| md3p-int4 | 6.48 GB | Medium | Memory-constrained |
| md3p-int4-smol | 5.43 GB | Lower | iOS (~6 GB limit) |
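
As a rough illustration of how the sizes above might guide a choice, the sketch below picks the highest-quality variant whose weights fit a given memory budget. The `pick_variant` helper and its `headroom` parameter are hypothetical and not part of any Moondream or MLX package. The sizes are copied from the table, and the comparison ignores runtime overhead such as activations and caches, so treat the result as a starting point only.

```python
# Hypothetical helper: choose a variant by comparing weight size to a memory budget.
# Sizes (GB) are taken from the table above; everything else is an assumption.
VARIANTS = [
    ("md3p-int8", 10.0),       # higher quality
    ("md3p-int4", 6.48),       # medium quality
    ("md3p-int4-smol", 5.43),  # lower quality
]


def pick_variant(available_gb, headroom=1.0):
    """Return the first (highest-quality) variant that fits, or None.

    headroom is a multiplier on the weight size; raise it above 1.0 to
    leave room for runtime memory that the table does not account for.
    """
    for name, size_gb in VARIANTS:  # ordered from highest to lowest quality
        if size_gb * headroom <= available_gb:
            return name
    return None


if __name__ == "__main__":
    for budget in (16, 8, 6, 5):
        print(budget, "GB ->", pick_variant(budget))
```

Passing a `headroom` greater than 1.0 makes the choice more conservative, which may be sensible on devices where the model shares memory with other processes.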

This model is designed for use with MLX-based Moondream implementations.

# Example with mlx-lm or similar
from mlx_lm import load, generate

model, tokenizer = load("lewi/md3p-int8")

Thanks to the Moondream team for the original model and for releasing it under the Apache 2.0 license.