Use with the MLX library
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"

huggingface-cli download --local-dir md3p-int4 moondream/md3p-int4

MD3 Preview - Int4 Quantized (MLX)

Pre-quantized version of Moondream 3 Preview for MLX inference.

Quantization Details

  • MoE Experts: int4 affine quantization (bits=4, group_size=64)
  • Other weights: bf16 (unchanged)
  • Memory savings: ~60% reduction in MoE weight memory
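
To make "int4 affine quantization (bits=4, group_size=64)" concrete, here is a minimal sketch of generic affine group quantization in plain Python. It is an illustration only, not the code used to produce this checkpoint and not necessarily how MLX implements it. The function names `quantize_group` and `dequantize_group` are hypothetical, and the random weights are made up for the demo. Each group of 64 weights is stored as 4-bit codes (0 to 15) plus one scale and one bias, and is reconstructed as `scale * code + bias`.

```python
import random

BITS = 4               # from the model card
GROUP_SIZE = 64        # from the model card
LEVELS = 2**BITS - 1   # 15, the largest 4-bit code


def quantize_group(weights):
    """Map one group of floats to 4-bit codes plus a scale and a bias."""
    w_min, w_max = min(weights), max(weights)
    # Fall back to 1.0 so a constant group does not divide by zero.
    scale = (w_max - w_min) / LEVELS or 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights: w ~ scale * code + bias."""
    return [scale * c + bias for c in codes]


# Demo on synthetic weights (assumed distribution, for illustration only).
random.seed(0)
group = [random.gauss(0.0, 0.02) for _ in range(GROUP_SIZE)]
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
max_error = max(abs(a - b) for a, b in zip(group, restored))
```

In this scheme the reconstruction error of any weight is at most half a quantization step (`scale / 2`), which is why smaller groups, each with a narrower value range, tend to preserve accuracy better at the cost of storing more scales and biases.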

Source

Quantized from moondream/moondream3-preview
