---
license: apache-2.0
---

# Granite 4 GGUF (4-bit Quantized)

This repository hosts GGUF-format quantized versions of **IBM Granite 4** models at multiple parameter sizes. The models provided here are intended for **local inference** and are suitable for use with **SciTools' Understand and Onboard**, as well as other tools and runtimes that support the GGUF format (for example, llama.cpp-based applications; a minimal loading example appears at the end of this card).

---

## Model Details

- Base models: IBM Granite 4
- Variants provided: Micro 3B
- Format: GGUF
- Quantization: 4-bit (model-specific; see Quantization Process below)
- Intended use: Local inference, code understanding, general-purpose chat
- Languages: English (as supported by Granite 4)

---

## Quantization Process

- The **3B Granite 4 Micro** model is quantized using **Unsloth** tooling.
- No additional fine-tuning, rebalancing, or prompt modification was applied.
- Quantization parameters were not altered from their original sources.

These models are redistributed as-is to provide reproducible, efficient GGUF variants suitable for local workflows. A sketch of a typical Unsloth export flow appears at the end of this card.

---

## What We Did Not Do

To be explicit:

- No additional fine-tuning
- No instruction rebalancing
- No safety, alignment, or prompt modifications
- No merging or model surgery

Any observed behavior is attributable to **Granite 4 and the applied quantization**, not downstream changes.

---

## Intended Use

These models are suitable for:

- SciTools Understand and SciTools Onboard
- Local AI workflows
- Code comprehension and exploration
- Interactive chat and analysis
- Integration into developer tools that support GGUF

They are not intended for:

- Safety-critical or regulated decision-making
- Use cases requiring guaranteed factual accuracy
- Production deployment without independent evaluation

---

## Limitations

- As 4-bit quantized models, some reduction in reasoning depth and precision is expected compared to full-precision checkpoints.
- Output quality varies across parameter sizes and quantization levels.
- Like all large language models, Granite 4 may produce incorrect or misleading outputs. Evaluate carefully for your specific workload.

---

## License & Attribution

- Original models: IBM (Granite 4)
- Quantization: IBM and Unsloth
- Format: GGUF (llama.cpp ecosystem)

Please refer to the original Granite 4 license and usage terms. This repository redistributes quantized artifacts only and does not modify the underlying licensing conditions.

---

## Acknowledgements

Thanks to **IBM** for releasing the Granite 4 models and to **Unsloth** for providing efficient, reproducible quantization that enables practical local inference.
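
---

## Example: Local Inference

As a minimal sketch of the local-inference workflow described above, the snippet below loads a 4-bit GGUF file with the `llama-cpp-python` bindings, one of many llama.cpp-based runtimes that support GGUF. The filename shown is a hypothetical placeholder; substitute whichever artifact you download from this repository, and adjust context size and GPU offload to your hardware.

```python
from llama_cpp import Llama

# Load the quantized model. The path below is a hypothetical placeholder
# for the GGUF file downloaded from this repository.
llm = Llama(
    model_path="granite-4-micro-3b-q4.gguf",  # hypothetical filename
    n_ctx=4096,        # context window; raise or lower to fit memory
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

# Run a simple chat-style completion.
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Summarize what the GGUF format is."}
    ],
    max_tokens=256,
)

print(response["choices"][0]["message"]["content"])
```

Any other GGUF-capable runtime (llama.cpp's CLI, SciTools Onboard, etc.) can load the same file; only the loading API differs.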
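
---

## Example: Quantization Sketch (Unsloth)

For reference, the snippet below is an illustrative sketch of a typical Unsloth GGUF-export flow, not a record of the exact commands used here: the precise parameters behind these artifacts are not published in this card, and the Hugging Face model id shown is an assumption.

```python
from unsloth import FastLanguageModel

# Load the base model (the model id below is an assumed placeholder).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ibm-granite/granite-4.0-micro",  # assumption, not confirmed
    max_seq_length=4096,
    load_in_4bit=True,
)

# Export directly to GGUF; q4_k_m is a common 4-bit quantization scheme,
# shown here only as a representative choice.
model.save_pretrained_gguf(
    "granite-4-micro-gguf",
    tokenizer,
    quantization_method="q4_k_m",
)
```
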