|
|
--- |
|
|
license: apache-2.0 |
|
|
library_name: transformers |
|
|
tags: |
|
|
- language |
|
|
- granite-4.0 |
|
|
- mlx |
|
|
- open4bits |
|
|
base_model: ibm-granite/granite-4.0-micro |
|
|
--- |
|
|
|
|
|
# Open4bits / Granite-4.0-Micro-MLX-3Bit |
|
|
|
|
|
This repository provides the **Granite-4.0 Micro model quantized to 3-bit in MLX format**, published by Open4bits to enable efficient local inference with low memory usage on Apple silicon hardware.
|
|
|
|
|
The underlying Granite-4.0 model and architecture are **developed and owned by their original authors**. This repository contains only a 3-bit quantized MLX conversion of the original model weights. |
|
|
|
|
|
The model is designed for lightweight, high-performance text generation and instruction-following tasks, making it suitable for local and resource-constrained environments. |
|
|
|
|
|
Open4bits has started supporting **MLX models** to broaden compatibility with emerging quantization formats and efficient runtimes. |
|
|
|
|
|
--- |
|
|
|
|
|
## Model Overview |
|
|
|
|
|
Granite-4.0 Micro is a compact variant of the Granite-4.0 architecture optimized for efficient inference and lower resource footprints. |
|
|
This release provides a **3-bit quantized checkpoint in MLX format**, enabling fast local inference on Apple silicon with reduced memory demands.
|
|
|
|
|
--- |
|
|
|
|
|
## Model Details |
|
|
|
|
|
* **Base Model:** Granite-4.0 |
|
|
* **Variant:** Micro |
|
|
* **Quantization:** 3-bit |
|
|
* **Format:** MLX |
|
|
* **Task:** Text generation, instruction following |
|
|
* **Weight tying:** Preserved |
|
|
* **Compatibility:** `mlx-lm` and other MLX-based inference runtimes
|
|
|
|
|
This quantized format balances inference performance with lower resource requirements while preserving core architectural design. |
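
A rough back-of-the-envelope estimate of the weight memory footprint, assuming roughly 3B parameters (an approximation for Granite-4.0 Micro; see the base model card for the exact count) and ignoring quantization scales, embeddings kept at higher precision, and activation memory:

```python
# Approximate weight storage for a 3-bit quantized checkpoint.
params = 3e9            # assumed parameter count (verify on the base model card)
bits_per_weight = 3     # this release's quantization level

weight_bytes = params * bits_per_weight / 8
print(f"~{weight_bytes / 1e9:.2f} GB for weights alone")  # ~1.12 GB
```

Actual on-disk and in-memory sizes will be somewhat larger because of per-group quantization metadata and any tensors stored at higher precision.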
|
|
|
|
|
--- |
|
|
|
|
|
## Intended Use |
|
|
|
|
|
This model is intended for: |
|
|
|
|
|
* Local text generation and chat applications |
|
|
* CPU-based or resource-efficient deployments |
|
|
* Research, experimentation, and prototyping |
|
|
* Offline or self-hosted AI systems |
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
* Some quality degradation relative to full-precision or higher-bit variants, as 3-bit quantization is comparatively aggressive
|
|
* Output quality depends on prompt engineering and inference settings |
|
|
* Not fine-tuned for highly domain-specific tasks |
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
This model follows the **Apache License 2.0** of the base Granite-4.0 model.
|
|
Users must comply with the licensing conditions defined by the original creators. |
|
|
|
|
|
--- |
|
|
|
|
|
## Support |
|
|
|
|
|
If you find this model useful, please consider supporting the project. |
|
|
Your support encourages Open4bits to continue releasing and maintaining efficient open models for the community. |
|
|
|
|
|
|