---
license: apache-2.0
---

# Granite 4 GGUF (4-bit Quantized)

This repository hosts GGUF-format quantized versions of **IBM Granite 4** models at multiple parameter sizes.

The models provided here are intended for **local inference** and are suitable for use with **SciTools’ Understand and Onboard**, as well as other tools and runtimes that support the GGUF format (for example, llama.cpp-based applications).

---
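As a quick sanity check after downloading, a tool can verify that a file really is GGUF by inspecting its header: GGUF files begin with the 4-byte magic `GGUF` followed by a little-endian `uint32` format version. A minimal sketch (the filename in the usage comment is a placeholder, not an actual artifact name in this repository):

```python
import struct

def is_gguf(path: str) -> bool:
    """Return True if the file starts with a plausible GGUF header (magic + version)."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return False
    # The format version is a little-endian uint32 immediately after the magic.
    (version,) = struct.unpack("<I", header[4:8])
    return version >= 1

# Usage (hypothetical filename):
# is_gguf("granite-4-micro-3b-q4.gguf")
```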

## Model Details

- Base models: IBM Granite 4
- Variants provided: Micro (3B)
- Format: GGUF
- Quantization: 4-bit (model-specific)
- Intended use: Local inference, code understanding, general-purpose chat
- Languages: English (as supported by Granite 4)

---
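The practical appeal of 4-bit quantization is the memory footprint. A back-of-the-envelope estimate (assuming roughly 4.5 effective bits per weight to account for stored scales; actual overhead varies by quantization type):

```python
def quantized_size_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Rough file-size estimate: parameters times bits per weight, in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# A 3B-parameter model at ~4.5 effective bits per weight comes out
# to roughly 1.7 GB, versus about 6 GB at 16-bit precision.
print(quantized_size_gb(3e9))      # ~1.7
print(quantized_size_gb(3e9, 16))  # 6.0
```

This ignores runtime overhead such as the KV cache, so actual memory use during inference will be higher.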

## Quantization Process

- The **3B Granite 4 Micro** model is quantized using **Unsloth** tooling.
- No additional fine-tuning, rebalancing, or prompt modification was applied.
- Quantization parameters were not altered from their original sources.

These models are redistributed as-is to provide reproducible, efficient GGUF variants suitable for local workflows.

---
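For intuition about what 4-bit quantization does to the weights, here is a toy symmetric (absmax) scheme: each block of floats is scaled so its values fit into 16 signed integer levels, stored as integers plus one scale, and dequantized at inference. This is an illustration only, not the actual Unsloth or GGUF quantization formats:

```python
def quantize_block(values, levels=7):
    """Toy symmetric 4-bit quantization: map floats to integers in [-7, 7]."""
    scale = max(abs(v) for v in values) / levels or 1.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize_block(q, scale):
    """Recover approximate floats from the stored integers and the block scale."""
    return [x * scale for x in q]

weights = [0.12, -0.55, 0.91, -0.07]
q, s = quantize_block(weights)
approx = dequantize_block(q, s)
# approx is close to weights; the rounding error per value is at most scale / 2.
```

Real formats quantize many such blocks per tensor, which is why the effective bits per weight are slightly above 4.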

## What We Did Not Do

To be explicit:

- No additional fine-tuning
- No instruction rebalancing
- No safety, alignment, or prompt modifications
- No merging or model surgery

Any observed behavior is attributable to **Granite 4 and the applied quantization**, not downstream changes.

---

## Intended Use

These models are suitable for:

- SciTools Understand and SciTools Onboard
- Local AI workflows
- Code comprehension and exploration
- Interactive chat and analysis
- Integration into developer tools that support GGUF

They are not intended for:

- Safety-critical or regulated decision-making
- Use cases requiring guaranteed factual accuracy
- Production deployment without independent evaluation

---

## Limitations

- As 4-bit quantized models, some reduction in reasoning depth and precision is expected compared to full-precision checkpoints.
- Output quality varies by variant and quantization level.
- Like all large language models, Granite 4 may produce incorrect or misleading outputs.

Evaluate carefully for your specific workload.

---

## License & Attribution

- Original models: IBM (Granite 4)
- Quantization: IBM and Unsloth
- Format: GGUF (llama.cpp ecosystem)

Please refer to the original Granite 4 license and usage terms. This repository redistributes quantized artifacts only and does not modify the underlying licensing conditions.

---

## Acknowledgements

Thanks to **IBM** for releasing the Granite 4 models and to **Unsloth** for providing efficient, reproducible quantization that enables practical local inference.