# Ensemble MLX Vault
## ⚠️ Important Legal & Usage Notice

**This repository is a private personal storage vault.**
### Sourcing & Ownership
- Not an Original Creator: The models stored in this repository are NOT created, trained, or owned by the owner of this repository.
- Sourced from MLX-Community: All models are redistributed from the MLX-Community organization on Hugging Face.
- Purpose: This repository serves as an optimized, mobile-ready mirror of specific 4-bit quantized versions, ensuring consistent and fast access.
### Compliance & Licenses
The weights in this repository are subject to the original licenses provided by their respective authors (Meta, Google, Mistral, Microsoft, Sarvam AI, Alibaba/Qwen, etc.).
- Users must adhere to the individual model licenses (e.g., Llama 3.2 Community License, Apache 2.0, MIT) as defined by the original creators.
- This repository is provided "as is" without any warranties.
## Repository Overview
This vault contains 30 model snapshots, each optimized for Apple Silicon via the MLX framework.
**Selection Criteria:**
- Size: Strictly under 4 billion parameters.
- Quantization: 4-bit (Safetensors).
- Weight Limit: Every model folder is under 3.5 GB.
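The weight limit above can be verified locally before syncing a snapshot. A minimal sketch using only the Python standard library (the 3.5 GB cap comes from the criteria above; the folder path passed in is whatever local snapshot you want to check):

```python
from pathlib import Path

GIB = 1024 ** 3          # bytes per GiB
WEIGHT_LIMIT_GIB = 3.5   # per-folder cap used by this vault

def folder_size_bytes(folder):
    """Sum the sizes of all files under a model snapshot folder."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())

def within_weight_limit(folder, limit_gib=WEIGHT_LIMIT_GIB):
    """Return True if the snapshot's total on-disk size is under the cap."""
    return folder_size_bytes(folder) < limit_gib * GIB
```

Running `within_weight_limit("path/to/snapshot")` before upload catches oversized folders early.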
**Key Included Series:**
- Qwen 3.5 / 2.5 (VL, Coder, and Base)
- Gemma 3 / 2 (Vision and Text)
- Llama 3.2 / DeepSeek R1 (Instruct and Reasoning)
- Sarvam AI (Indic and Translation)
- Phi 4 / 3.5 / 2
- SmolLM2 / Danube / Falcon 3
## Sync Status
Last synchronized: March 10, 2026.

Contact: reach the Ensemble AI team via Private Message on Hugging Face for access inquiries.