|
|
--- |
|
|
license: apache-2.0 |
|
|
library_name: transformers |
|
|
tags: |
|
|
- language |
|
|
- granite-4.0 |
|
|
- mlx |
|
|
- open4bits |
|
|
base_model: ibm-granite/granite-4.0-micro |
|
|
--- |
|
|
|
|
|
# Open4bits / Granite-4.0-Micro-MLX-3Bit |
|
|
|
|
|
This repository provides the **Granite-4.0 Micro model quantized to 3-bit in MLX format**, published by Open4bits to enable efficient local inference with low memory usage on Apple silicon hardware.
|
|
|
|
|
The underlying Granite-4.0 model and architecture are **developed and owned by their original authors**. This repository contains only a 3-bit quantized MLX conversion of the original model weights. |
|
|
|
|
|
The model is designed for lightweight, high-performance text generation and instruction-following tasks, making it suitable for local and resource-constrained environments. |
|
|
|
|
|
Open4bits has started supporting **MLX models** to broaden compatibility with emerging quantization formats and efficient runtimes. |
|
|
|
|
|
--- |
|
|
|
|
|
## Model Overview |
|
|
|
|
|
Granite-4.0 Micro is a compact variant of the Granite-4.0 architecture optimized for efficient inference and lower resource footprints. |
|
|
This release provides a **3-bit quantized checkpoint in MLX format**, enabling fast local inference on Apple silicon with reduced memory demands.
|
|
|
|
|
--- |
|
|
|
|
|
## Model Details |
|
|
|
|
|
* **Base Model:** Granite-4.0 |
|
|
* **Variant:** Micro |
|
|
* **Quantization:** 3-bit |
|
|
* **Format:** MLX |
|
|
* **Task:** Text generation, instruction following |
|
|
* **Weight tying:** Preserved |
|
|
* **Compatibility:** `mlx-lm` and other MLX-based inference runtimes
|
|
|
|
|
This quantized format balances inference performance with lower resource requirements while preserving core architectural design. |
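
A rough back-of-the-envelope estimate of the weight memory footprint, assuming roughly 3B parameters (an approximation for Granite-4.0 Micro; see the base model card for the exact count) and ignoring quantization scales, embeddings kept at higher precision, and activation memory:

```python
# Approximate weight storage for a 3-bit quantized checkpoint.
params = 3e9            # assumed parameter count (verify on the base model card)
bits_per_weight = 3     # this release's quantization level

weight_bytes = params * bits_per_weight / 8
print(f"~{weight_bytes / 1e9:.2f} GB for weights alone")  # ~1.12 GB
```

Actual on-disk and in-memory sizes will be somewhat larger because of per-group quantization metadata and any tensors stored at higher precision.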
|
|
|
|
|
--- |
|
|
|
|
|
## Intended Use |
|
|
|
|
|
This model is intended for: |
|
|
|
|
|
* Local text generation and chat applications |
|
|
* CPU-based or resource-efficient deployments |
|
|
* Research, experimentation, and prototyping |
|
|
* Offline or self-hosted AI systems |
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
* Some quality degradation relative to full-precision or higher-bit variants, as 3-bit quantization is comparatively aggressive
|
|
* Output quality depends on prompt engineering and inference settings |
|
|
* Not fine-tuned for highly domain-specific tasks |
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
This model follows the **Apache License 2.0** of the base Granite-4.0 model.
|
|
Users must comply with the licensing conditions defined by the original creators. |
|
|
|
|
|
--- |
|
|
|
|
|
## Support |
|
|
|
|
|
If you find this model useful, please consider supporting the project. |
|
|
Your support encourages Open4bits to continue releasing and maintaining efficient open models for the community. |
|
|
|
|
|
|