tags:
  - open4bits
base_model: ibm-granite/granite-4.0-micro
---

# Open4bits / Granite-4.0-Micro-MLX-3Bit

This repository provides the **Granite-4.0 Micro model quantized to 3-bit in MLX format**, published by Open4bits to enable efficient local inference with low memory usage and broad hardware compatibility.

The underlying Granite-4.0 model and architecture are **developed and owned by their original authors**. This repository contains only a 3-bit quantized MLX conversion of the original model weights.

The model is designed for lightweight, high-performance text generation and instruction-following tasks, making it suitable for local and resource-constrained environments.

Open4bits has started supporting **MLX models** to broaden compatibility with emerging quantization formats and efficient runtimes.

---

## Model Overview

Granite-4.0 Micro is a compact variant of the Granite-4.0 architecture optimized for efficient inference and a lower resource footprint.
This release provides a **3-bit quantized checkpoint in MLX format**, enabling fast inference on CPUs and supported accelerators with reduced memory demands.

---

## Model Details

* **Base Model:** Granite-4.0
* **Variant:** Micro
* **Quantization:** 3-bit
* **Format:** MLX
* **Task:** Text generation, instruction following
* **Weight tying:** Preserved
* **Compatibility:** MLX-enabled inference engines and supported runtimes

This quantized format balances inference performance with lower resource requirements while preserving the core architectural design.

---
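As a quick start, the checkpoint can be loaded with the `mlx-lm` Python package (requires Apple silicon). This is a minimal sketch; the repository id below is illustrative and should be replaced with this repo's actual Hugging Face id:

```python
# pip install mlx-lm
from mlx_lm import load, generate

# Repository id is an assumption -- substitute the real one for this repo.
model, tokenizer = load("open4bits/Granite-4.0-Micro-MLX-3Bit")

prompt = "Explain 3-bit quantization in one sentence."
text = generate(model, tokenizer, prompt=prompt, max_tokens=100)
print(text)
```

`load` pulls the quantized weights and tokenizer from the Hub (or a local path), and `generate` runs greedy decoding by default; sampling parameters can be passed through `mlx_lm`'s sampler utilities if needed.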

## Intended Use

This model is intended for:

* Local text generation and chat applications
* CPU-based or resource-efficient deployments
* Research, experimentation, and prototyping
* Offline or self-hosted AI systems

---

## Limitations

* Reduced performance compared to full-precision variants
* Output quality depends on prompt engineering and inference settings
* Not fine-tuned for highly domain-specific tasks

---

## License

This model is distributed under the **Apache License 2.0** of the base Granite-4.0 model.
Users must comply with the licensing conditions defined by the original creators.

---

## Support

If you find this model useful, please consider supporting the project.
Your support helps Open4bits continue releasing and maintaining efficient open models for the community.