---
license: apache-2.0
---
# Granite 4 GGUF (4-bit Quantized)
This repository hosts GGUF-format 4-bit quantized versions of **IBM Granite 4** models (currently the 3B Micro variant).
The models provided here are intended for **local inference** and are suitable for use with **SciTools’ Understand and Onboard**, as well as other tools and runtimes that support the GGUF format (for example, llama.cpp-based applications).
---
## Model Details
- Base models: IBM Granite 4
- Variants provided: Micro 3B
- Format: GGUF
- Quantization: 4-bit (model-specific, see table below)
- Intended use: Local inference, code understanding, general-purpose chat
- Languages: English (as supported by Granite 4)
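Since everything here ships in GGUF, a quick sanity check after download is to confirm the file actually starts with a valid GGUF header. The sketch below reads the 4-byte magic `GGUF` and the little-endian `uint32` version field defined by the ggml GGUF specification; the filename in the comment is illustrative, not a file guaranteed to exist in this repository.

```python
# Minimal sketch: verify that a file looks like a GGUF artifact by
# checking the 4-byte magic "GGUF" and reading the format version
# (uint32, little-endian), per the ggml GGUF specification.
import struct

GGUF_MAGIC = b"GGUF"

def read_gguf_version(data: bytes):
    """Return the GGUF version from a file's leading bytes, or None."""
    if len(data) < 8 or data[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack_from("<I", data, 4)
    return version

# Example with an in-memory header; a real check would read the first
# 8 bytes of a downloaded file (e.g. a granite-4 *.gguf artifact).
sample = GGUF_MAGIC + struct.pack("<I", 3)
print(read_gguf_version(sample))  # → 3
```

A `None` result usually means a truncated or corrupted download rather than a runtime incompatibility.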
---
## Quantization Process
- The **3B Granite 4 Micro** model is quantized using **Unsloth** tooling.
- No additional fine-tuning, rebalancing, or prompt modification was applied.
- Quantization parameters were not altered from their original sources.
These models are redistributed as-is to provide reproducible, efficient GGUF variants suitable for local workflows.
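To give a feel for what 4-bit quantization does to the weights, here is a toy sketch of block-wise symmetric 4-bit quantization. It is purely illustrative: the actual ggml/Unsloth Q4 kernels use different block layouts, scale encodings, and rounding, so treat the block size and scale formula below as assumptions for demonstration only.

```python
# Toy illustration of block-wise symmetric 4-bit quantization.
# NOT the exact ggml or Unsloth Q4 scheme; layout and rounding differ.

BLOCK_SIZE = 32  # weights are quantized in small fixed-size blocks

def quantize_block(values):
    """Map a block of floats to 4-bit codes in [-8, 7] plus one scale."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 7.0
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants

def dequantize_block(scale, quants):
    """Reconstruct approximate float weights from the 4-bit codes."""
    return [q * scale for q in quants]

block = [(-1) ** i * (i / 31.0) for i in range(BLOCK_SIZE)]
scale, quants = quantize_block(block)
restored = dequantize_block(scale, quants)
max_err = max(abs(a - b) for a, b in zip(block, restored))
print(f"max reconstruction error: {max_err:.4f}")
```

The per-block rounding error is bounded by half the scale, which is why 4-bit variants trade a small amount of precision for a roughly 4x smaller memory footprint than FP16 checkpoints.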
---
## What We Did Not Do
To be explicit:
- No additional fine-tuning
- No instruction rebalancing
- No safety, alignment, or prompt modifications
- No merging or model surgery
Any observed behavior is attributable to **Granite 4 and the applied quantization**, not downstream changes.
---
## Intended Use
These models are suitable for:
- SciTools Understand and SciTools Onboard
- Local AI workflows
- Code comprehension and exploration
- Interactive chat and analysis
- Integration into developer tools that support GGUF
They are not intended for:
- Safety-critical or regulated decision-making
- Use cases requiring guaranteed factual accuracy
- Production deployment without independent evaluation
---
## Limitations
- As 4-bit quantized models, some reduction in reasoning depth and precision is expected compared with the full-precision checkpoints.
- Output quality varies across Granite 4 parameter sizes and quantization levels.
- Like all large language models, Granite 4 may produce incorrect or misleading outputs.
Evaluate carefully for your specific workload.
---
## License & Attribution
- Original models: IBM (Granite 4)
- Quantization: IBM and Unsloth
- Format: GGUF (llama.cpp ecosystem)
Please refer to the original Granite 4 license and usage terms. This repository redistributes quantized artifacts only and does not modify the underlying licensing conditions.
---
## Acknowledgements
Thanks to **IBM** for releasing the Granite 4 models and to **Unsloth** for providing efficient, reproducible quantization that enables practical local inference.