# Ensemble MLX Vault
## ⚠️ Important Legal & Usage Notice

**This repository is a private personal storage vault.**
### Sourcing & Ownership
- Not an Original Creator: The models stored in this repository are NOT created, trained, or owned by the owner of this repository.
- Sourced from MLX-Community: All models are redistributed from the MLX-Community organization on Hugging Face.
- Purpose: This repository serves as an optimized, mobile-ready mirror of specific 4-bit quantized versions, ensuring consistent and fast access.
### Compliance & Licenses
The weights in this repository are subject to the original licenses provided by their respective authors (Meta, Google, Mistral, Microsoft, Sarvam AI, Alibaba/Qwen, etc.).
- Users must adhere to the individual model licenses (e.g., Llama 3.2 Community License, Apache 2.0, MIT) as defined by the original creators.
- This repository is provided "as is" without any warranties.
## Repository Overview
This vault contains 30 model snapshots, each optimized for Apple Silicon via the MLX framework.
**Selection Criteria:**
- Size: Strictly under 4 billion parameters.
- Quantization: 4-bit (Safetensors).
- Weight Limit: Every model folder is under 3.5 GB.
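The weight limit above can be verified locally before syncing a snapshot. A minimal sketch using only the Python standard library (the 3.5 GB cap comes from the criteria above; the folder path passed in is whatever local snapshot you want to check):

```python
from pathlib import Path

GIB = 1024 ** 3          # bytes per GiB
WEIGHT_LIMIT_GIB = 3.5   # per-folder cap used by this vault

def folder_size_bytes(folder):
    """Sum the sizes of all files under a model snapshot folder."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())

def within_weight_limit(folder, limit_gib=WEIGHT_LIMIT_GIB):
    """Return True if the snapshot's total on-disk size is under the cap."""
    return folder_size_bytes(folder) < limit_gib * GIB
```

Running `within_weight_limit("path/to/snapshot")` before upload catches oversized folders early.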
**Key Included Series:**
- Qwen 3.5 / 2.5 (VL, Coder, and Base)
- Gemma 3 / 2 (Vision and Text)
- Llama 3.2 / DeepSeek R1 (Instruct and Reasoning)
- Sarvam AI (Indic and Translation)
- Phi 4 / 3.5 / 2
- SmolLM2 / Danube / Falcon 3
## Sync Status
Last synchronized: March 10, 2026.

Contact: reach the Ensemble AI team via Private Message on Hugging Face for access inquiries.