---
license: apache-2.0
---
# Granite 4 GGUF (4-bit Quantized)
This repository hosts GGUF-format 4-bit quantized versions of **IBM Granite 4** models (currently the 3B Micro variant).
The models provided here are intended for **local inference** and are suitable for use with **SciTools’ Understand and Onboard**, as well as other tools and runtimes that support the GGUF format (for example, llama.cpp-based applications).
---
## Model Details
- Base models: IBM Granite 4
- Variants provided: Micro 3B
- Format: GGUF
- Quantization: 4-bit (model-specific, see table below)
- Intended use: Local inference, code understanding, general-purpose chat
- Languages: English (as supported by Granite 4)
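Since everything here ships in GGUF, a quick sanity check after download is to confirm the file actually starts with a valid GGUF header. The sketch below reads the 4-byte magic `GGUF` and the little-endian `uint32` version field defined by the ggml GGUF specification; the filename in the comment is illustrative, not a file guaranteed to exist in this repository.

```python
# Minimal sketch: verify that a file looks like a GGUF artifact by
# checking the 4-byte magic "GGUF" and reading the format version
# (uint32, little-endian), per the ggml GGUF specification.
import struct

GGUF_MAGIC = b"GGUF"

def read_gguf_version(data: bytes):
    """Return the GGUF version from a file's leading bytes, or None."""
    if len(data) < 8 or data[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack_from("<I", data, 4)
    return version

# Example with an in-memory header; a real check would read the first
# 8 bytes of a downloaded file (e.g. a granite-4 *.gguf artifact).
sample = GGUF_MAGIC + struct.pack("<I", 3)
print(read_gguf_version(sample))  # → 3
```

A `None` result usually means a truncated or corrupted download rather than a runtime incompatibility.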
---
## Quantization Process
- The **3B Granite 4 Micro** model is quantized using **Unsloth** tooling.
- No additional fine-tuning, rebalancing, or prompt modification was applied.
- Quantization parameters were not altered from their original sources.
These models are redistributed as-is to provide reproducible, efficient GGUF variants suitable for local workflows.
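To give a feel for what 4-bit quantization does to the weights, here is a toy sketch of block-wise symmetric 4-bit quantization. It is purely illustrative: the actual ggml/Unsloth Q4 kernels use different block layouts, scale encodings, and rounding, so treat the block size and scale formula below as assumptions for demonstration only.

```python
# Toy illustration of block-wise symmetric 4-bit quantization.
# NOT the exact ggml or Unsloth Q4 scheme; layout and rounding differ.

BLOCK_SIZE = 32  # weights are quantized in small fixed-size blocks

def quantize_block(values):
    """Map a block of floats to 4-bit codes in [-8, 7] plus one scale."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 7.0
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants

def dequantize_block(scale, quants):
    """Reconstruct approximate float weights from the 4-bit codes."""
    return [q * scale for q in quants]

block = [(-1) ** i * (i / 31.0) for i in range(BLOCK_SIZE)]
scale, quants = quantize_block(block)
restored = dequantize_block(scale, quants)
max_err = max(abs(a - b) for a, b in zip(block, restored))
print(f"max reconstruction error: {max_err:.4f}")
```

The per-block rounding error is bounded by half the scale, which is why 4-bit variants trade a small amount of precision for a roughly 4x smaller memory footprint than FP16 checkpoints.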
---
## What We Did Not Do
To be explicit:
- No additional fine-tuning
- No instruction rebalancing
- No safety, alignment, or prompt modifications
- No merging or model surgery
Any observed behavior is attributable to **Granite 4 and the applied quantization**, not downstream changes.
---
## Intended Use
These models are suitable for:
- SciTools Understand and SciTools Onboard
- Local AI workflows
- Code comprehension and exploration
- Interactive chat and analysis
- Integration into developer tools that support GGUF
They are not intended for:
- Safety-critical or regulated decision-making
- Use cases requiring guaranteed factual accuracy
- Production deployment without independent evaluation
---
## Limitations
- As 4-bit quantized models, some reduction in reasoning depth and precision is expected compared with the full-precision checkpoints.
- Output quality varies across Granite 4 parameter sizes and quantization levels.
- Like all large language models, Granite 4 may produce incorrect or misleading outputs.
Evaluate carefully for your specific workload.
---
## License & Attribution
- Original models: IBM (Granite 4)
- Quantization: IBM and Unsloth
- Format: GGUF (llama.cpp ecosystem)
Please refer to the original Granite 4 license and usage terms. This repository redistributes quantized artifacts only and does not modify the underlying licensing conditions.
---
## Acknowledgements
Thanks to **IBM** for releasing the Granite 4 models and to **Unsloth** for providing efficient, reproducible quantization that enables practical local inference.