---
license: apache-2.0
---

# Granite 4 GGUF (4-bit Quantized)

This repository hosts GGUF-format quantized versions of **IBM Granite 4** models at multiple parameter sizes. The models provided here are intended for **local inference** and are suitable for use with **SciTools' Understand and Onboard**, as well as other tools and runtimes that support the GGUF format (for example, llama.cpp-based applications; a minimal loading example appears at the end of this card).

---

## Model Details

- Base models: IBM Granite 4
- Variants provided: Micro 3B
- Format: GGUF
- Quantization: 4-bit (model-specific; see Quantization Process below)
- Intended use: Local inference, code understanding, general-purpose chat
- Languages: English (as supported by Granite 4)

---

## Quantization Process

- The **3B Granite 4 Micro** model is quantized using **Unsloth** tooling.
- No additional fine-tuning, rebalancing, or prompt modification was applied.
- Quantization parameters were not altered from their original sources.

These models are redistributed as-is to provide reproducible, efficient GGUF variants suitable for local workflows. A sketch of a typical Unsloth export flow appears at the end of this card.

---

## What We Did Not Do

To be explicit:

- No additional fine-tuning
- No instruction rebalancing
- No safety, alignment, or prompt modifications
- No merging or model surgery

Any observed behavior is attributable to **Granite 4 and the applied quantization**, not downstream changes.

---

## Intended Use

These models are suitable for:

- SciTools Understand and SciTools Onboard
- Local AI workflows
- Code comprehension and exploration
- Interactive chat and analysis
- Integration into developer tools that support GGUF

They are not intended for:

- Safety-critical or regulated decision-making
- Use cases requiring guaranteed factual accuracy
- Production deployment without independent evaluation

---

## Limitations

- As 4-bit quantized models, some reduction in reasoning depth and precision is expected compared to full-precision checkpoints.
- Output quality varies across parameter sizes and quantization levels.
- Like all large language models, Granite 4 may produce incorrect or misleading outputs. Evaluate carefully for your specific workload.

---

## License & Attribution

- Original models: IBM (Granite 4)
- Quantization: IBM and Unsloth
- Format: GGUF (llama.cpp ecosystem)

Please refer to the original Granite 4 license and usage terms. This repository redistributes quantized artifacts only and does not modify the underlying licensing conditions.

---

## Acknowledgements

Thanks to **IBM** for releasing the Granite 4 models and to **Unsloth** for providing efficient, reproducible quantization that enables practical local inference.
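
---

## Example: Local Inference

As a minimal sketch of the local-inference workflow described above, the snippet below loads a 4-bit GGUF file with the `llama-cpp-python` bindings, one of many llama.cpp-based runtimes that support GGUF. The filename shown is a hypothetical placeholder; substitute whichever artifact you download from this repository, and adjust context size and GPU offload to your hardware.

```python
from llama_cpp import Llama

# Load the quantized model. The path below is a hypothetical placeholder
# for the GGUF file downloaded from this repository.
llm = Llama(
    model_path="granite-4-micro-3b-q4.gguf",  # hypothetical filename
    n_ctx=4096,        # context window; raise or lower to fit memory
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

# Run a simple chat-style completion.
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Summarize what the GGUF format is."}
    ],
    max_tokens=256,
)

print(response["choices"][0]["message"]["content"])
```

Any other GGUF-capable runtime (llama.cpp's CLI, SciTools Onboard, etc.) can load the same file; only the loading API differs.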
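
---

## Example: Quantization Sketch (Unsloth)

For reference, the snippet below is an illustrative sketch of a typical Unsloth GGUF-export flow, not a record of the exact commands used here: the precise parameters behind these artifacts are not published in this card, and the Hugging Face model id shown is an assumption.

```python
from unsloth import FastLanguageModel

# Load the base model (the model id below is an assumed placeholder).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ibm-granite/granite-4.0-micro",  # assumption, not confirmed
    max_seq_length=4096,
    load_in_4bit=True,
)

# Export directly to GGUF; q4_k_m is a common 4-bit quantization scheme,
# shown here only as a representative choice.
model.save_pretrained_gguf(
    "granite-4-micro-gguf",
    tokenizer,
    quantization_method="q4_k_m",
)
```
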